THE DETERMINATION OF SUCCESSIVE PRINCIPAL
COMPONENTS WITHOUT COMPUTATION OF 
TABLES OF RESIDUAL CORRELATION 
COEFFICIENTS* 


LEDYARD R. TUCKER 


PSYCHOMETRIC LABORATORY 
THE UNIVERSITY OF CHICAGO 


A procedure is presented for determining the successive principal components of a correlation matrix where it is not necessary to compute the successive tables of residual correlations. The original correlation matrix is bordered with a new row and column for each principal component that is determined.


The calculation of tables of residual coefficients of correlation has been one of the most laborious processes in the resolution of a set of variables into their principal components. Starting with an original matrix of correlations, $R_1$, the coefficients of the variables on the first principal component are determined. The entries in the table of residuals are computed by the formula

$$r_{2.jk} = r_{1.jk} - a_{j1} a_{k1}, \tag{1}$$

where $r_{2.jk}$ is the residual coefficient, $r_{1.jk}$ is the original correlation, and $a_{j1}$ and $a_{k1}$ are the coefficients of variables $j$ and $k$ on the first principal component. The variable coefficients, $a_{j2}$ and $a_{k2}$, on the second principal component are determined from the matrix $R_2$.

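A simpler procedure for obtaining the second principal component is to border the original matrix $R_1$ by a new row and column as follows:

$$R_{II} = \begin{Vmatrix}
r_{1.11} & r_{1.12} & \cdots & r_{1.1n} & a_{11}\sqrt{k_1}\,i \\
r_{1.21} & r_{1.22} & \cdots & r_{1.2n} & a_{21}\sqrt{k_1}\,i \\
\vdots & \vdots & & \vdots & \vdots \\
r_{1.n1} & r_{1.n2} & \cdots & r_{1.nn} & a_{n1}\sqrt{k_1}\,i \\
a_{11}\sqrt{k_1}\,i & a_{21}\sqrt{k_1}\,i & \cdots & a_{n1}\sqrt{k_1}\,i & -k_1
\end{Vmatrix}$$
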
* This paper is one of a series of reports on the development of multiple fac- 
tor analysis in the study of primary human abilities which have been supported 
by research grants from the Carnegie Corporation of New York and the facilities 
of the Psychometric Laboratory, which have been provided by the Social Science 
Research Committee of The University of Chicago. 
where $k_1 = \sum_j a_{j1}^2$, and $i$ is the imaginary number $\sqrt{-1}$. The first principal component of this enlarged matrix $R_{II}$ is the second principal component of the original matrix $R_1$.

The foregoing relation can be demonstrated by considering the matrix $A$ containing the coefficients on all of the principal components. This matrix has the properties that

$$A A' = R_1 \tag{2}$$

and

$$A' A = K, \tag{3}$$

where $K$ is a diagonal matrix with the diagonal element for the $m$th principal component

$$k_m = \sum_j a_{jm}^2. \tag{4}$$


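A new matrix, $A_{II}$, can be formed by adding a new row with a first entry of $\sqrt{k_1}\,i$ and all other entries of zero as follows:

$$A_{II} = \begin{Vmatrix}
a_{11} & a_{12} & a_{13} & \cdots \\
a_{21} & a_{22} & a_{23} & \cdots \\
\vdots & \vdots & \vdots & \\
a_{j1} & a_{j2} & a_{j3} & \cdots \\
\vdots & \vdots & \vdots & \\
a_{n1} & a_{n2} & a_{n3} & \cdots \\
\sqrt{k_1}\,i & 0 & 0 & \cdots
\end{Vmatrix}$$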
Then

$$A_{II} A'_{II} = R_{II} \tag{5}$$

and

$$A'_{II} A_{II} = K_{II}. \tag{6}$$

It will be noted that the only entry in $K$ which is altered in going to $K_{II}$ is the first diagonal. This first diagonal has a value of

$$k_{II.1} = \sum_j a_{j1}^2 + (\sqrt{k_1}\,i)^2, \tag{7}$$

which by (4) becomes

$$k_{II.1} = k_1 - k_1 = 0. \tag{8}$$

Since $K_{II}$ is a diagonal matrix, $A_{II}$ contains the principal components of $R_{II}$, but equations (7) and (8) show that the first column of $A_{II}$ has a vanishing $k$ and thus is not the first principal component of $R_{II}$. Hence the first principal component of $R_{II}$ is the second column of $A_{II}$, which is the second principal component of $R_1$.
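The argument of equations (2) to (8) can be checked numerically. The following is a minimal sketch, assuming NumPy and using the fictitious matrix $R_1$ of Table 1; the variable names are illustrative and not part of the original paper. It builds $A$ from an eigendecomposition (a modern shortcut, not Hotelling's iteration), borders it as described, and compares the results with equations (5), (6), and (8).

```python
import numpy as np

# R_1 as printed in Table 1 of this paper (a fictitious 4 x 4 matrix).
R1 = np.array([
    [ 0.04, 0.04, 0.08, -0.10],
    [ 0.04, 0.29, 0.18,  0.08],
    [ 0.08, 0.18, 0.56,  0.16],
    [-0.10, 0.08, 0.16,  0.61],
])

# Principal-component coefficients A = V * sqrt(k), largest root first.
k, V = np.linalg.eigh(R1)
order = np.argsort(k)[::-1]
k = np.clip(k[order], 0.0, None)      # remove tiny negative rounding noise
A = V[:, order] * np.sqrt(k)

# Border A with the row (sqrt(k_1) i, 0, ..., 0) to form A_II.
new_row = np.zeros((1, A.shape[1]), dtype=complex)
new_row[0, 0] = 1j * np.sqrt(k[0])
A_II = np.vstack([A.astype(complex), new_row])

# Plain transpose (no complex conjugation), as in equations (5) and (6).
R_II = A_II @ A_II.T
K_II = A_II.T @ A_II

# Bordering R_1 directly with a_j1 sqrt(k_1) i and -k_1 gives the same matrix.
side = 1j * np.sqrt(k[0]) * A[:, 0]
R_II_direct = np.zeros((5, 5), dtype=complex)
R_II_direct[:4, :4] = R1
R_II_direct[:4, 4] = side
R_II_direct[4, :4] = side
R_II_direct[4, 4] = -k[0]

print(np.round(K_II.real.diagonal(), 4))   # first diagonal entry vanishes
```

Under these assumptions the diagonal of $K_{II}$ keeps the roots of the later components unchanged while the first entry becomes zero, which is the content of equation (8).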

When a third principal component is desired, a matrix $R_{III}$ can be formed by bordering $R_{II}$ with a column and row with elements $a_{j2}\sqrt{k_2}\,i$. The diagonal element of this new row and column is $-k_2$. The proof that the first principal component of $R_{III}$ is the third principal component of $R_1$ is identical with the foregoing proof for $R_{II}$.

It is desirable to know the variance of the correlations that is left unaccounted for after each factor, and this can be found by the relation

$$\sum_j \sum_k r_{m.jk}^2 = \sum_j \sum_k r_{1.jk}^2 - \sum_{p=1}^{m-1} k_p^2. \tag{9}$$
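As an illustration of relation (9), the short sketch below (assuming NumPy, and taking $R_1$ and the first-component coefficients $a_{j1}$ from Table 1) compares the sum of squared residuals computed from formula (1) with the value predicted from the root $k_1$ alone. The variable names are hypothetical.

```python
import numpy as np

# R_1 and the first-component coefficients a_j1 from Table 1.
R1 = np.array([
    [ 0.04, 0.04, 0.08, -0.10],
    [ 0.04, 0.29, 0.18,  0.08],
    [ 0.08, 0.18, 0.56,  0.16],
    [-0.10, 0.08, 0.16,  0.61],
])
a1 = np.array([0.0, 0.3, 0.6, 0.6])
k1 = float(a1 @ a1)                      # k_1 = 0.81

# Left side of (9) for m = 2: residuals by formula (1), then squared and summed.
R2 = R1 - np.outer(a1, a1)
lhs = float(np.sum(R2 ** 2))

# Right side of (9): original sum of squares less k_1 squared.
rhs = float(np.sum(R1 ** 2)) - k1 ** 2

print(round(lhs, 4), round(rhs, 4))
```

The point of the relation is that the right-hand side needs only the original matrix and the roots already found, so no residual table has to be written out.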


When Hotelling's iterative method (1) is being used and has been accelerated (2) by raising $R$ to the power $t$ so as to iterate on $R^t$, the same acceleration can be obtained by bordering $R^t$ by a row and column with elements $a_{j1} k_1^{t/2}\,i$ and diagonal $-k_1^t$.

Table 1 represents a fictitious matrix $R_1$ which is used to illustrate the procedure for finding successive principal components without calculation of residual coefficients. The coefficients on the first principal component, $a_{j1}$, are also given in Table 1, as are values of $k_1$ and $\sqrt{k_1}$. $R_{II}$ is given in Table 2. $R_1$ has been bordered by a row and column with side entries $a_{j1}\sqrt{k_1}\,i$. The diagonal entry of the new row and column is $-k_1$. Hotelling's iterative method was applied to $R_{II}$, and Table 3 presents the first eight iterations. It will be noted that the successive trial values approach being proportional to the coefficients on the second principal component given in Table 2. The matrix $R_{III}$ is given in Table 4. This matrix is found by bordering $R_{II}$ of Table 2 with a row and column with elements $a_{j2}\sqrt{k_2}\,i$ and diagonal of $-k_2$. The third principal component was determined from $R_{III}$ by Hotelling's iterative method and is listed in Table 4.
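The illustration can be reproduced in outline as follows. This is a minimal sketch, assuming NumPy; the starting trial vector and the function name `hotelling_iterate` are assumptions made for the example and are not taken from the paper. Each trial vector is multiplied by $R_{II}$ and rescaled so that its largest entry has absolute value 1, as in Table 3.

```python
import numpy as np

# R_1 and a_j1 from Table 1; k_1 is the sum of squares of a_j1.
R1 = np.array([
    [ 0.04, 0.04, 0.08, -0.10],
    [ 0.04, 0.29, 0.18,  0.08],
    [ 0.08, 0.18, 0.56,  0.16],
    [-0.10, 0.08, 0.16,  0.61],
])
a1 = np.array([0.0, 0.3, 0.6, 0.6])
k1 = float(a1 @ a1)

# Border R_1 with side entries a_j1 sqrt(k_1) i and diagonal -k_1.
side = 1j * np.sqrt(k1) * a1
R_II = np.zeros((5, 5), dtype=complex)
R_II[:4, :4] = R1
R_II[:4, 4] = side
R_II[4, :4] = side
R_II[4, 4] = -k1

def hotelling_iterate(R, trial, n_iter=40):
    """Repeatedly multiply by R, rescaling so the largest entry has modulus 1."""
    v = np.asarray(trial, dtype=complex)
    for _ in range(n_iter):
        v = R @ v
        v = v / np.max(np.abs(v))
    return v

# Hypothetical starting trial vector (the paper does not state the one used).
v = hotelling_iterate(R_II, [1.0, 1.0, 1.0, -1.0, 0.0])

# Root k_2 from the ratio of successive values, then rescale so that
# the sum of squares of the coefficients equals k_2.
big = np.argmax(np.abs(v))
k2 = ((R_II @ v)[big] / v[big]).real
a2 = (v * np.sqrt(k2 / (v @ v).real))[:4].real

print(np.round(v[:4].real, 3), round(k2, 4), np.round(a2, 3))
```

With this starting vector the trial values settle on the same proportions as the later columns of Table 3, and the entry in the bordering row tends to zero.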


REFERENCES


1. Hotelling, Harold. Analysis of a complex of statistical variables into principal components. J. educ. Psychol., 1933, 24, 417-441, 498-520.

2. Hotelling, Harold. Simplified calculation of principal components. Psychometrika, 1936, 1, 27-35.
TABLE 1
R_1

        1      2      3      4     a_j1
1     .04    .04    .08   -.10     .0
2     .04    .29    .18    .08     .3
3     .08    .18    .56    .16     .6
4    -.10    .08    .16    .61     .6

k_1 = .81     sqrt(k_1) = .9

TABLE 2
R_II

        1      2      3      4      I      a_j2
1     .04    .04    .08   -.10    .00i      .2
2     .04    .29    .18    .08    .27i      .2
3     .08    .18    .56    .16    .54i      .4
4    -.10    .08    .16    .61    .54i     -.5
I    .00i   .27i   .54i   .54i   -.81       .0

k_2 = .49     sqrt(k_2) = .7

TABLE 3
Successive Iterations on R_II

         3       4       5       6       7       8
1      .380    .400    .400    .400    .399    .400
2     1.000    .667    .508    .445    .418    .406
3      .449    .667    .747    .780    .790    .796
4     -.949  -1.000  -1.000  -1.000  -1.000  -1.000
I      .000    .000    .000   .002i    .000  -.002i

TABLE 4
R_III

        1      2      3      4      I      II     a_j3
1     .04    .04    .08   -.10    .00i    .14i     .0
2     .04    .29    .18    .08    .27i    .14i     .4
3     .08    .18    .56    .16    .54i    .28i    -.2
4    -.10    .08    .16    .61    .54i   -.35i     .0
I    .00i   .27i   .54i   .54i   -.81     .00      .0
II   .14i   .14i   .28i  -.35i    .00    -.49      .0
FACTORING TEST SCORES AND IMPLICATIONS
FOR THE METHOD OF AVERAGES


KARL J. HOLZINGER
THE UNIVERSITY OF CHICAGO


The general procedure and detailed steps for attaining complete factor analyses of scores are presented. Both orthogonal and oblique factors are considered. It is shown that a single average by conventional procedure gives an incomplete summarization of the data when the rank exceeds one. There should be as many averages as there are common factors.


I. General Theory


In factoring correlations it is customary to assume linear relationships between variables and factors of the form,

$$\begin{aligned}
w_1 &= a_{11} G_1 + a_{12} G_2 + \cdots + a_{1m} G_m \\
w_2 &= a_{21} G_1 + a_{22} G_2 + \cdots + a_{2m} G_m \\
&\;\;\vdots \\
w_n &= a_{n1} G_1 + a_{n2} G_2 + \cdots + a_{nm} G_m
\end{aligned} \tag{1}$$

or

$$w_j = a_{j1} G_1 + a_{j2} G_2 + \cdots + a_{jm} G_m, \tag{1$'$}$$

where $w_j$ $(j = 1, 2, 3, \cdots, n)$ are the variables, $G_s$ $(s = 1, 2, 3, \cdots, m)$ are factors, and $a_{js}$ are coefficients in these linear expressions. The variables and factors are usually taken in standard form (deviates from means divided by standard deviations). In the present discussion, however, all such variables will be taken in normalized form. Thus if $X_j$ is a variable with mean $M_j$, then $x_j = X_j - M_j$ and the normalized value becomes

$$w_j = \frac{x_j}{\sqrt{\sum x_j^2}}. \tag{2}$$
Variables in normalized form have the advantage of greater simplicity than standardized values, both analytically and geometrically. If $(i = 1, 2, 3, \cdots, N)$ indicates the range of individuals, then equation (2) may be written more precisely

$$w_{ji} = \frac{x_{ji}}{\sqrt{\sum_{i=1}^{N} x_{ji}^2}}. \tag{2$'$}$$

The correlation between two normalized variables $w_j$ and $w_k$ may then be written in the form

$$r_{jk} = \sum_{i=1}^{N} w_{ji} w_{ki}. \tag{3}$$

Obviously

$$\sum_{i=1}^{N} w_{ji}^2 = 1 \tag{4}$$

and $w_{ji}$ may be interpreted as direction cosines.
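Equations (2)' to (4) can be illustrated with the raw scores of Table 1 of this paper. The sketch below assumes NumPy; the raw score 11 for Test 3, person a, is the value implied by the deviate of -18.0 and the mean of 29.0 in Table 2. It normalizes each row of deviates and forms the correlations as sums of cross-products.

```python
import numpy as np

# Raw scores of four persons (columns a, b, c, d) on five tests (rows), Table 1.
X = np.array([
    [ 1,  2,  3,  4],
    [ 1,  6,  6,  3],
    [11, 26, 36, 43],
    [ 9, 14, 24, 37],
    [ 4, 20, 21, 13],
], dtype=float)

# Deviates from the row means, then normalization as in equation (2)'.
x = X - X.mean(axis=1, keepdims=True)
W = x / np.sqrt((x ** 2).sum(axis=1, keepdims=True))

# Equation (3): correlations are sums of cross-products of normalized scores.
R = W @ W.T

print(np.round(W, 4))
print(np.round(R, 4))
```

Each row of `W` has a sum of squares of one, as equation (4) requires, and `R` agrees with the ordinary product-moment correlations of the raw scores.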

Inasmuch as a variable may be considered as a finite set of scores*, equations (1) may also be written as a set of relationships between scores on tests (or other variables) and factor scores. Equation (1)' may then be written in the form

$$w_{ji} = a_{j1} g_{1i} + a_{j2} g_{2i} + \cdots + a_{jm} g_{mi}. \tag{5}$$

This expression will next be written in matrix form. Let

$$W_{ji} = \begin{Vmatrix}
w_{11} & w_{12} & \cdots & w_{1N} \\
w_{21} & w_{22} & \cdots & w_{2N} \\
\vdots & \vdots & & \vdots \\
w_{n1} & w_{n2} & \cdots & w_{nN}
\end{Vmatrix}$$

denote the matrix of normalized scores.


* The reason the analysis of scores has been overlooked so long is probably due to the fact that alternate interpretations of the word "variable" have not been clear. If $w_j$ denotes the variable "height" we may imagine a continuum on which an indefinitely large number of values may be indicated. If a finite set of heights

$W_j$: || 62, 63, 69, 65, 64, ... , 68 ||

is given, this row matrix or "vector" may also be considered as a variable. It is this latter interpretation of "variable" that makes possible the geometric vector representation of variables, and suggests the factoring of scores instead of correlations.
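Let

$$A_{js} = \begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1m} \\
a_{21} & a_{22} & \cdots & a_{2m} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nm}
\end{Vmatrix}$$

denote the matrix of factor coefficients. Let

$$G_{si} = \begin{Vmatrix}
g_{11} & g_{12} & \cdots & g_{1N} \\
g_{21} & g_{22} & \cdots & g_{2N} \\
\vdots & \vdots & & \vdots \\
g_{m1} & g_{m2} & \cdots & g_{mN}
\end{Vmatrix}$$

denote the matrix of factor scores.

Equation (5) may then be written as

$$W_{ji} = A_{js} G_{si} \tag{6}$$

or for purposes of calculation as

$$W_{ji} = A_{j1} G_{1i} + A_{j2} G_{2i} + \cdots + A_{jm} G_{mi}. \tag{6$'$}$$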


There are two main problems in a factor analysis: the determination of $A_{js}$ and of $G_{si}$. The calculation of $A_{js}$ yields what is known as a "pattern" of the form (1), while the elements of $G_{si}$ are the factor scores of the individuals. In factoring correlations, the matrix $A_{js}$ is first determined by one of the several methods available, and then factor scores $g_{si}$ are calculated directly or estimated. In factoring scores directly, this order is reversed, i.e., the values $g_{si}$ are computed by a simple process of averaging, and then the matrix $A_{js}$ may be obtained.


II. Orthogonal Factors 

Consider first the case of orthogonal centroid factors. The first centroid factor scores $g_{1i}$ may be defined as those obtained by normalizing the totals (or averages) in the columns of $W_{ji}$ (Table 3). This calculation yields the row matrix $G_{1i}$. The columns of the matrix $A_{js}$ may then be found in the following manner: Postmultiply both sides of (6)' by $G_{i1}$ to obtain

$$W_{ji} G_{i1} = A_{j1} [G_{1i} G_{i1}] + A_{j2} [G_{2i} G_{i1}] + \cdots + A_{jm} [G_{mi} G_{i1}]. \tag{7}$$

For orthogonal factors of the form here employed, the first square bracket becomes the identity matrix and the remaining terms vanish, yielding

$$W_{ji} G_{i1} = A_{j1} \tag{8}$$

as the first column of $A_{js}$.
The first term of the right-hand member of equation (6)' may then be determined by the matrix product

$$A_{j1} G_{1i} = {}_1W_{ji}, \tag{9}$$

which may be interpreted as the part of $W_{ji}$ attributable to the factor $G_1$.

The elements of ${}_1W_{ji}$ are next subtracted from those of $W_{ji}$ to give the residual matrix $W_{ji} - {}_1W_{ji}$ of Table 4. Next the signs of the entries in certain rows of this last matrix are changed to "remove the centroid from the origin," and to make the new column totals as large as possible. The columns are then added and the totals normalized as before to yield the second-factor scores $G_{2i}$ (Table 4).

Postmultiplying both sides of equation (6)' by $G_{i2}$ there results

$$W_{ji} G_{i2} = A_{j2}. \tag{10}$$

The product

$$A_{j2} G_{2i} = {}_2W_{ji} \tag{11}$$

gives the portion of $W_{ji}$ attributable to the second centroid factor $G_2$. Subtracting the elements of ${}_2W_{ji}$ from those of $W_{ji} - {}_1W_{ji}$ yields the matrix of second residuals $W_{ji} - {}_1W_{ji} - {}_2W_{ji}$. This process may be continued until the final residuals are considered negligible. In the present illustration the original matrix is an artificial one of exactly rank two, so the second residuals of Table 5 are zero within errors of rounding.

The solution thus obtained will be identical with that by the 
usual centroid method based on correlations with unities in the diag- 
onal. We shall not be concerned in the present paper with the vexing 
problems of communalities or “when to stop factoring,” but merely 
present methods for factoring “whole scores” until the residual ma- 
trix contains entries considered practically unimportant. 


III. Steps in the Computation of the Centroid Solution 


The detailed steps in the calculations of the centroid solution 
will next be presented in such a manner that they may be followed 
by the student in routine fashion once he has mastered the simple 
operation of matrix multiplication.* 

Step 1. Arrange the scores in a matrix with tests identified by 
rows and persons by columns (Table 1). 

Step 2. Calculate the means of the rows of this matrix (Table 1). 

Step 3. Subtract the mean of each row from each entry in the 
row of this matrix to obtain the deviates from the means. Obtain the 


* Karl J. Holzinger and H. H. Harman, Factor Analysis, Appendix A. Chi- 
cago: University of Chicago Press, 1941. 
sum of the squares of the entries in each row for subsequent calculation (Table 2).

Step 4. Divide each entry in Table 2 by the square root of the sum of the squares for each row to give the normalized deviates $w_{ji}$ of Table 3. The sum of the squares of these values should be 1.000 for each row (check).

Step 5. Add the columns of $W_{ji}$ (Table 3) to give the "column sums," and normalize these values in the manner employed to obtain the values $w_{ji}$ in the body of Table 3. The last row of numbers in this table are the values $g_{1i}$, the first centroid factor scores. The sum of their squares should be checked as 1.000.

Step 6. Perform the matrix multiplication $W_{ji} G_{i1} = A_{j1}$ as illustrated below:

$$\begin{Vmatrix}
-.6708 & -.2236 & .2236 & .6708 \\
-.7071 & .4714 & .4714 & -.2357 \\
-.7487 & -.1248 & .2912 & .5823 \\
-.5607 & -.3271 & .1402 & .7476 \\
-.7720 & .4044 & .4779 & -.1103
\end{Vmatrix}
\times
\begin{Vmatrix} -.8312 \\ .0481 \\ .3855 \\ .3976 \end{Vmatrix}
=
\begin{Vmatrix} .8997 \\ .6984 \\ .9601 \\ .8016 \\ .8015 \end{Vmatrix}$$

The values $a_{j1}$ at the right are the coefficients of the first centroid factor $G_1$ in the pattern (6).

Step 7. Perform the matrix multiplication $A_{j1} G_{1i} = {}_1W_{ji}$ as follows:

$$\begin{Vmatrix} .8997 \\ .6984 \\ .9601 \\ .8016 \\ .8015 \end{Vmatrix}
\times
\begin{Vmatrix} -.8312 & .0481 & .3855 & .3976 \end{Vmatrix}
=
\begin{Vmatrix}
-.7478 & .0433 & .3468 & .3577 \\
-.5805 & .0336 & .2692 & .2777 \\
-.7980 & .0462 & .3701 & .3817 \\
-.6663 & .0386 & .3090 & .3187 \\
-.6662 & .0386 & .3090 & .3187
\end{Vmatrix}$$

The matrix on the right, ${}_1W_{ji}$, shows the part of the score matrix $W_{ji}$ attributable to the first centroid factor.

Step 8. Subtract the elements of ${}_1W_{ji}$ from those of $W_{ji}$ to obtain the residual matrix of Table 4; e.g., the top left element of Table 4 is $-.6708 - (-.7478) = .0770$, etc.

Step 9. The totals of the columns in Table 4 are zero, so the centroid must be removed from the origin. This is accomplished by changing the signs of the entries in the rows of the table to make the totals positive and large. In Table 4 the signs of rows 2 and 5 were changed because they had the largest negative values. A similar scheme might be followed with actual data.

Step 10. Calculate the sums of the columns and the normalized sums of the matrix of Table 4 (with sign changes) in the same manner as in Step 5. The bottom row of Table 4 consists of the elements $G_{2i}$, the second centroid factor scores.

Step 11. Form the product $W_{ji} G_{i2} = A_{j2}$ as shown below:

$$\begin{Vmatrix}
-.6708 & -.2236 & .2236 & .6708 \\
-.7071 & .4714 & .4714 & -.2357 \\
-.7487 & -.1248 & .2912 & .5823 \\
-.5607 & -.3271 & .1402 & .7476 \\
-.7720 & .4044 & .4779 & -.1103
\end{Vmatrix}
\times
\begin{Vmatrix} .1767 \\ -.6117 \\ -.2824 \\ .7175 \end{Vmatrix}
=
\begin{Vmatrix} .4364 \\ -.7155 \\ .2796 \\ .5978 \\ -.5979 \end{Vmatrix}$$

Step 12. Form the product $A_{j2} G_{2i} = {}_2W_{ji}$ as follows:

$$\begin{Vmatrix} .4364 \\ -.7155 \\ .2796 \\ .5978 \\ -.5979 \end{Vmatrix}
\times
\begin{Vmatrix} .1767 & -.6117 & -.2824 & .7175 \end{Vmatrix}
=
\begin{Vmatrix}
.0771 & -.2669 & -.1232 & .3131 \\
-.1264 & .4377 & .2021 & -.5134 \\
.0494 & -.1710 & -.0790 & .2006 \\
.1056 & -.3657 & -.1688 & .4289 \\
-.1056 & .3657 & .1688 & -.4290
\end{Vmatrix}$$

This last matrix, ${}_2W_{ji}$, represents the part of the original scores $W_{ji}$ attributable to the second centroid factor $G_2$.

Step 13. Subtract the matrix, ${}_2W_{ji}$, from the first residual matrix of Table 4 to obtain the matrix of Table 5 with both factors removed. With actual data, the above process may be continued until the residuals may be considered negligible.

Step 14. The two parts of the solution obtained are the factor pattern $A_{js}$ calculated in Steps 6 and 11, and the factor scores of the individuals, $G_{si}$, obtained from the bottom rows of Tables 3 and 4. These two matrices may then be written in the form of Tables 6 and 7. This last table illustrates one of the fundamental characteristics of factor analysis. The eight factor scores of Table 7 contain all the essential information about the four persons originally indicated in the matrix of Tables 1 or 3 by twenty scores.
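Steps 1 to 13 can be sketched compactly as below, assuming NumPy. The function names and the sign-change rule are assumptions made for the example: rows are reflected when their cross-product with the first row of the current residual matrix is negative, a simple stand-in for the inspection described in Step 9 (for the data of Table 1 it reflects rows 2 and 5, as in Table 4).

```python
import numpy as np

def normalize(v):
    """Scale a vector so that its sum of squares is 1."""
    return v / np.sqrt(np.sum(v ** 2))

def centroid_factors(W, n_factors):
    """Extract centroid factor scores G and pattern A from normalized scores W."""
    residual = W.copy()
    A = np.zeros((W.shape[0], n_factors))
    G = np.zeros((n_factors, W.shape[1]))
    for s in range(n_factors):
        # Step 9 (assumed rule): reflect rows opposed to the first residual row.
        signs = np.sign(residual @ residual[0])
        signs[signs == 0] = 1.0
        # Steps 5 and 10: normalized column sums give the factor scores.
        g = normalize((signs[:, None] * residual).sum(axis=0))
        # Steps 6 and 11: pattern column from the original scores.
        a = W @ g
        # Steps 7-8 and 12-13: remove the part attributable to this factor.
        residual = residual - np.outer(a, g)
        A[:, s], G[s] = a, g
    return A, G, residual

# Steps 1-4: raw scores of Table 1 turned into normalized deviates.
X = np.array([
    [ 1,  2,  3,  4],
    [ 1,  6,  6,  3],
    [11, 26, 36, 43],
    [ 9, 14, 24, 37],
    [ 4, 20, 21, 13],
], dtype=float)
x = X - X.mean(axis=1, keepdims=True)
W = x / np.sqrt((x ** 2).sum(axis=1, keepdims=True))

A, G, residual = centroid_factors(W, n_factors=2)
print(np.round(A, 4))
print(np.round(G, 4))
```

Because the illustrative data are of exactly rank two, the residual matrix after the second factor is zero apart from floating-point error, and `A` and `G` agree with Tables 6 and 7 to within rounding.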


IV. Oblique Factors 

A solution involving correlated factors may be obtained much more simply than in the case of orthogonal factors in case the matrix $W_{ji}$ can be sectioned (by rows) so that each sub-group of tests may be considered as measuring a single factor. The normalized totals for these sections then become the oblique factor scores denoted by $l_{si}$ (Table 9). Such factors may be extracted all at once instead of one at a time as in the orthogonal case. Let

$$W_{ji} = A_{js} L_{si} \tag{12}$$

represent the factor pattern for the oblique factors $L_s$. Postmultiplying both sides of (12) by $L_{is}$ gives
$$W_{ji} L_{is} = A_{js} [L_{si} L_{is}] = A_{js} \phi_{ss} = \text{Structure}^{*} = S_{js}, \tag{13}$$

where $\phi_{ss}$ is the matrix of correlations between factors, and $S_{js}$ is the matrix of the structure values, which are correlations between tests and factors. It will be noted that $A_{js}$ is identical with $S_{js}$ only in the case of orthogonal factors. In this case $\phi_{ss}$ equals the identity matrix $I$.

The pattern $A_{js}$ can be obtained by postmultiplying both sides of equation (13) by $\phi_{ss}^{-1}$ to give

$$A_{js} = S_{js} \phi_{ss}^{-1}. \tag{14}$$

The calculations are most readily done by the Doolittle† or similar method.
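Equations (12) to (14) can be sketched as follows, assuming NumPy and the grouping of tests used later in Table 8 (tests 2 and 5 in one section, tests 1, 3, and 4 in the other). A general linear solver stands in here for the Doolittle method named in the text; the variable names are illustrative.

```python
import numpy as np

def normalize(v):
    """Scale a vector so that its sum of squares is 1."""
    return v / np.sqrt(np.sum(v ** 2))

# Normalized deviates W from the raw scores of Table 1.
X = np.array([
    [ 1,  2,  3,  4],
    [ 1,  6,  6,  3],
    [11, 26, 36, 43],
    [ 9, 14, 24, 37],
    [ 4, 20, 21, 13],
], dtype=float)
x = X - X.mean(axis=1, keepdims=True)
W = x / np.sqrt((x ** 2).sum(axis=1, keepdims=True))

# Sections of tests (zero-based rows): tests 2 and 5, then tests 1, 3 and 4.
groups = [[1, 4], [0, 2, 3]]

# Oblique factor scores: normalized totals of each section.
L = np.vstack([normalize(W[g].sum(axis=0)) for g in groups])

S = W @ L.T                        # structure, equation (13)
phi = L @ L.T                      # correlations between factors
A = np.linalg.solve(phi, S.T).T    # pattern A = S phi^{-1}, equation (14)

print(np.round(phi, 3))
print(np.round(A, 4))
```

Since `phi` is symmetric, solving `phi @ A.T = S.T` is equivalent to postmultiplying `S` by the inverse of `phi`. For these rank-two data the product `A @ L` reproduces `W`.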


V. Steps in the Calculation of Oblique Factors 


Step 1. Rearrange the rows of the normalized test scores $W_{ji}$ of Table 3 according to the content of the tests or other criteria (Table 8). The basis for the grouping of the variables in this example was the agreement in algebraic signs of the scores. The points representing the variables in vector form thus fall into two distinct "sedecimants."‡

Step 2. Add the scores in these sections and normalize the totals as in Step 5 of the preceding section to give the oblique factor scores $l_{1i}$ and $l_{2i}$ (Tables 8 and 9).

Step 3. Postmultiply $W_{ji}$ by the transpose of $L_{si}$ to obtain $S_{js}$ of equation (13) as shown below:

$$\begin{Vmatrix}
-.7071 & .4714 & .4714 & -.2357 \\
-.7720 & .4044 & .4779 & -.1103 \\
-.6708 & -.2236 & .2236 & .6708 \\
-.7487 & -.1248 & .2912 & .5823 \\
-.5607 & -.3271 & .1402 & .7476
\end{Vmatrix}
\times
\begin{Vmatrix}
-.7418 & -.6671 \\
.4392 & -.2276 \\
.4761 & .2207 \\
-.1735 & .6740
\end{Vmatrix}
=
\begin{Vmatrix}
.9969 & .3096 \\
.9969 & .4541 \\
.3895 & .9998 \\
.5382 & .9846 \\
.2093 & .9833
\end{Vmatrix}$$

Step 4. Obtain $A_{js}$ from equation (14) by the Doolittle method, where

$$\phi_{ss} = L_{si} L_{is} = \begin{Vmatrix} 1.000 & .383 \\ .383 & 1.000 \end{Vmatrix}.$$

The result of this computation is

$$A_{js} = \begin{Vmatrix}
1.0293 & -.0846 \\
.9645 & .0847 \\
.0077 & .9968 \\
.1888 & .9123 \\
-.1961 & 1.0584
\end{Vmatrix}.$$

Step 5. From the values obtained in Steps 2 and 4 determine ${}_pW_{ji} = A_{js} L_{si}$ as "pattern scores":

$$\begin{Vmatrix}
1.0293 & -.0846 \\
.9645 & .0847 \\
.0077 & .9968 \\
.1888 & .9123 \\
-.1961 & 1.0584
\end{Vmatrix}
\times
\begin{Vmatrix}
-.7418 & .4392 & .4761 & -.1735 \\
-.6671 & -.2276 & .2207 & .6740
\end{Vmatrix}
= {}_pW_{ji}.$$

* Ibid., pp. 325-27.
† Ibid., pp. 386-87.
‡ Ibid., p. 252.

When this last multiplication for ${}_pW_{ji}$ is carried out, the entries are found to agree with those of Table 8 with a maximum discrepancy of .0003. With actual scores the discrepancies might be large enough to justify a resorting of the variables and recalculation of the whole solution.

FIGURE 1 (test points 1 to 5 and person points a, b, c, d plotted against the $G_1$ and $G_2$ axes)


VI. Geometric Interpretation of the Data 


A simple geometric interpretation of the foregoing analysis may be made because all the data are contained in a two-space. The normalized deviates may be considered as the coordinates of points equidistant from the origin. In Figure 1, the coordinates of the five test points with respect to the $G_1$, $G_2$ axes are obtained from Table 6. The dotted lines to the points a, b, c, and d are the projections of the person axes on the plane here shown, the coordinates of the points being obtained from Table 7. The projections of these points in turn on the axis for Test 1 (for example) are also shown in the figure, the numerical values being those of the top row of Table 3. These values show the amount of Test 1 possessed by each of the four persons. The first coordinate of each of the points a, b, c, and d shows the amount of $G_1$ possessed by the person, while the second coordinate indicates the amount of $G_2$ he possesses.
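The projection described above is the scalar product of a test's coordinates (Table 6) with a person's coordinates (Table 7). The following brief sketch, assuming NumPy and the four-decimal values printed in those tables, recovers the top row of Table 3 in this way; the array names are illustrative.

```python
import numpy as np

# Test coordinates on G_1 and G_2 (Table 6) and person coordinates (Table 7).
A = np.array([
    [0.8997,  0.4364],
    [0.6984, -0.7155],
    [0.9601,  0.2796],
    [0.8016,  0.5978],
    [0.8015, -0.5979],
])
G = np.array([
    [-0.8312,  0.1767],   # person a
    [ 0.0481, -0.6117],   # person b
    [ 0.3855, -0.2824],   # person c
    [ 0.3976,  0.7175],   # person d
])

# Projection of each person point on each test axis: w_ji = a_j1 g_1i + a_j2 g_2i.
W_hat = A @ G.T

print(np.round(W_hat[0], 4))   # scores of persons a, b, c, d on Test 1
```

The values agree with the first row of Table 3 to within the rounding of the tabled coefficients.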





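FIGURE 2 (test points and the projected person axis Oa, shown with their projections on the $G_1$ and $G_2$ axes)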
In Figure 2 the coordinates of the test points have been indicated as projections on the $G_1$ and $G_2$ axes. The projected axis Oa and the projections of its end point a on the $G_1$ and $G_2$ axes are also shown.

VII. Some Implications of the Analyses 


These analytic and geometric interpretations illustrate shortcomings of certain elementary statistical procedures. If the problem were to summarize the data of a table such as Table 3, the usual statistical procedure would be to add the columns and obtain averages. (In Table 3 the last row of numbers are merely normalized averages proportional to averages.) Here the ordinary investigation would stop, but it is quite apparent that only a part of the information about the data is thus obtained, viz., $G_{1i}$. Geometrically, this means that although the data are in a two-space, only projections on the $G_1$ axis are considered. To complete the summarization of the data for the individuals, $G_{2i}$ should be calculated, or geometrically, the projections on the $G_2$ axis should be made. A complete summarization could also be made, of course, by averaging the scores in sections as shown by the oblique analysis, but by either method, a complete analysis of the data must involve as many averages as there are factors.

The ordinary analyst might suppose that, even if several factors 
were involved in his data, a single average would somehow take them 
all into account. While this is true to some extent, much important 
information may still be ignored. 

From these considerations it is apparent that a single average as 
a complete summarization is justified only if the data are of rank one; 
that is, only if one common factor is involved. To employ a single 
average for data of higher rank is to summarize the material incom- 
pletely. It would therefore appear that people who use the method of 
averages should be also familiar with the methods of factor analysis. 
Since nearly everybody uses the method of averages—perhaps that 
would be asking too much. 
TABLE 1
Raw Scores of Four Persons on Five Tests

                 Person
Test      a      b      c      d      Mean
1         1      2      3      4       2.5
2         1      6      6      3       4.0
3        11     26     36     43      29.0
4         9     14     24     37      21.0
5         4     20     21     13      14.5

TABLE 2
Deviates from Means
x_ji

                   Person                  Sum of
Test       a       b       c       d      Squares
1       - 1.5   - 0.5     0.5     1.5        5
2       - 3.0     2.0     2.0   - 1.0       18
3       -18.0   - 3.0     7.0    14.0      578
4       -12.0   - 7.0     3.0    16.0      458
5       -10.5     5.5     6.5   - 1.5      185

TABLE 3
Normalized Deviates
w_ji

                            Person                       Sum of
Test                 a        b        c        d        Squares
1                -.6708   -.2236    .2236    .6708        .9999
2                -.7071    .4714    .4714   -.2357       1.0000
3                -.7487   -.1248    .2912    .5823       1.0000
4                -.5607   -.3271    .1402    .7476        .9999
5                -.7720    .4044    .4779   -.1103       1.0001
Column sums     -3.4593    .2003   1.6043   1.6547      17.3187
Normalized sums  -.8312    .0481    .3855    .3976        .9999

TABLE 4
First Residuals
W_ji - 1W_ji

                            Person
Test                 a        b        c        d
1                 .0770   -.2669   -.1232    .3131
2                -.1266    .4378    .2022   -.5134
3                 .0493   -.1710   -.0789    .2006
4                 .1056   -.3657   -.1688    .4289
5                -.1058    .3658    .1689   -.4290
Change signs of rows 2 and 5:
Column sums       .4643  -1.6072   -.7420   1.8850
Normalized sums   .1767   -.6117   -.2824    .7175

TABLE 5
Second Residuals
W_ji - 1W_ji - 2W_ji

                   Person
Test        a        b        c        d
1       -.0001    .0000    .0000    .0000
2       -.0002    .0001    .0001    .0000
3       -.0001    .0000    .0001    .0000
4        .0000    .0000    .0000    .0000
5       -.0002    .0001    .0001    .0000

TABLE 6
Centroid Solution
A_js

Test      A_j1      A_j2
1        .8997     .4364
2        .6984    -.7155
3        .9601     .2796
4        .8016     .5978
5        .8015    -.5979

TABLE 7
Factor Scores for Two Centroids
G_is

Person     G_1       G_2
a        -.8312     .1767
b         .0481    -.6117
c         .3855    -.2824
d         .3976     .7175

TABLE 8
Normalized Deviates
(Variables rearranged from Table 3)

                   Person                        Sum of
Test        a        b        c        d         Squares
2       -.7071    .4714    .4714   -.2357
5       -.7720    .4044    .4779   -.1103
Total  -1.4791    .8758    .9493   -.3460       3.975649
1       -.6708   -.2236    .2236    .6708
3       -.7487   -.1248    .2912    .5823
4       -.5607   -.3271    .1402    .7476
Total  -1.9802   -.6755    .6550   2.0007       8.809318

TABLE 9
Values of l_1i and l_2i
(Normalized sub-totals of Table 8)

                   Person                        Sum of
Factor      a        b        c        d         Squares
l_1i    -.7418    .4392    .4761   -.1735         .9999
l_2i    -.6671   -.2276    .2207    .6740         .9998
A NOTE ON CORRELATION CLUSTERS AND CLUSTER
SEARCH METHODS


RAYMOND CATTELL
DUKE UNIVERSITY


Four methods of determining the clusters in a correlation matrix are described and compared. The choice of method has to be made according to the size of the matrix and the type of cluster. The relativity of clusters is emphasized and a distinction is drawn between phenomenal clusters and nuclear clusters. The relative utility of clusters and factors is briefly commented upon.


1. The Setting of the Problem in Relation to Personality Research 

Since the study of individual differences aspired to become an exact science, psychology has had, as one of its major enterprises, the reduction of an almost endless variety of tests and ratings to a comparatively small number of representative variables. In this enterprise, especially as it concerned abilities, factor analysis has been unquestionably the main instrument. But since the wave of pioneer research entered the realm of personality variables there has appeared, among an appreciable number of workers (2, 6, 7, 9, 10, 11, 12, 14, 15), a preference for reduction into clusters rather than, or in addition to, factors. This may or may not prove justifiable, but since the discovery of clusters is, in any case, a valuable orientation-giving step, preliminary to factor analysis (especially in the new grouped centroid method of Thurstone, or Burt's analysis by sub-matrices (1), or in the present writer's use of a reduced "personality sphere" (4)), the techniques of cluster study deserve more attention than has hitherto been given to them.

The present note offers some brief observations, arising from ex- 
perience with a cluster analysis carried out on a larger scale than has 
previously been reported, concerning (1) the practical problems of 
establishing clusters, and (2) the relative utility of factors and 
clusters.* 

* The research from which these observations primarily arise was directed to discovering the basic factors in 171 personality variables (4). This very comprehensive list of variables was first resolved into 35 clusters, of from three to ten variables in each, and which included all but two of the original 171 variables. The 35 cluster variables were then assessed in relation to a larger population than could be rated and handled with accuracy for the 171 variables, and the inter-correlations factored into some eleven factors (5).
2. Four Methods of Determining Clusters

When only one or two dozen variables are involved the procedure of discovering clusters by direct inspection of the correlation matrix is comparatively simple and adequate. But when the inter-correlations become more numerous, e.g., reaching 14,535 in our research, more systematic methods are necessary if some clusters are not to be overlooked. It is perhaps because of the smallness of the matrices so far encountered that so little has been written by way of practical guidance on systematic procedures.

From publications and from discussions with those recently en- 
gaged in cluster studies, one may gather that four principal methods 
are at present in use. They can contingently be labelled as follows: 


(1) The Ramifying Linkage method. 

(2) The Matrix Diagonal method. 

(3) Correlation Profile Correlation (Tryon’s method). 

(4) The Approximate Delimitation (or Convergence) method. 


All methods require, directly or indirectly, that the experimenter 
shall begin by fixing some lower limit to the magnitude of correlation 
coefficient which will be accepted as qualification for entry to a clus- 
ter. The qualifying variable must manifest either (a) a mean corre- 
lation with the other members of the cluster, and/or (b) its lowest 
correlation with any other cluster member, exceeding the given lower 
limit. In general (b) has proved more practicable and been more 
widely employed. Many studies have proceeded, for example, accord- 
ing to the rule that a variable can be added to a cluster only if it cor- 
relates 0.45 or more (uncorrected for attenuation) with every other 
variable therein. Typically this results in a mean inter-correlation 
within the cluster of between 0.5 and 0.7. When some minimum of 
this kind has been fixed, according to the general nature of the data 
and the range of coefficients found therein, cluster search becomes 
first a matter of looking for linkages, a linkage being defined as a 
“significant” correlation, i.e., something above the agreed minimum. 
One must begin, therefore, by marking on the correlation matrix, e.g., 
by colored encirclements, the coefficients which constitute linkages. 
From this point on the necessary steps differ, according to the meth- 
od followed, as the accounts below indicate. 

A. The Ramifying Linkage method. This is the simplest meth- 
od, following first principles with pedestrian certainty. It has been 
used in most cluster studies, e.g., (7) and (12). 

The first step is to make out for each variable a separate card, listing thereon, in order of their occurrence in the matrix, the other variables which have linkages with it. These lists we may call Single Linkage lists. Thus one might find on the first card:

Variable A links with Variables D, G, K, R, S, V, W. 

One then takes up the D Single Linkage list card and inspects it for 
linkages of D with the variables in the above list to the right of D, 
perhaps with the following result: 

Variable D links with Variables K, S, V, W. 

One next takes up the K Single Linkage list and searches it for rela- 
tions with those to the right of K, perhaps with the following result: 

Variable K links with V only. 

A cluster of A, D, K, V has thus been established. 
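The card procedure just described can be sketched as below. This is a simplified illustration, not a full implementation of the method: it follows only the first path through the lists (always taking the leftmost remaining variable), and the function names and the set of linkages are hypothetical, chosen to reproduce the example cards above.

```python
def linked(links, x, y):
    """True if variables x and y have a linkage (order does not matter)."""
    return frozenset((x, y)) in links

def ramify(start, order, links):
    """Follow Single Linkage lists from `start`, keeping mutually linked variables."""
    cluster = [start]
    # The Single Linkage list of the starting variable, in matrix order.
    candidates = [v for v in order if v != start and linked(links, start, v)]
    while candidates:
        head = candidates[0]
        cluster.append(head)
        # Keep only the variables to the right of `head` that also link with it.
        candidates = [v for v in candidates[1:] if linked(links, head, v)]
    return cluster

# Hypothetical linkages matching the example cards in the text.
order = ["A", "B", "D", "G", "K", "R", "S", "V", "W"]
pairs = [("A", "D"), ("A", "G"), ("A", "K"), ("A", "R"), ("A", "S"),
         ("A", "V"), ("A", "W"), ("D", "K"), ("D", "S"), ("D", "V"),
         ("D", "W"), ("K", "V")]
links = {frozenset(p) for p in pairs}

print(ramify("A", order, links))   # ['A', 'D', 'K', 'V']
```

Because every variable added must link with each variable already in the cluster, the result satisfies criterion (b) given earlier. A complete search would also follow the alternative branches (for example, starting from G instead of D).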

The whole procedure must now be repeated for Variable B, and
so on through the list of variables, proceeding systematically in
alphabetical order (or other extrinsic order of variables in the matrix)
until all possibilities have been exhausted. The number of operations
(the writing down of linkages on a card) in a matrix of, say, ten
variables is not great, but in a matrix of fair size, say, of seventy
variables, it becomes enormous. If there are N variables in the
matrix and the likelihood of one variable having a linkage with another
is p (p usually being about one-tenth), one may show that the total
number of operations, i.e., of inspection of coefficients and writing
down of linkages, is approximately

$$N\left(\frac{N}{2} + \frac{Np}{2} \times \frac{Np^{2}}{2} \cdots\right)$$

carried out until $\frac{Np^{n}}{2}$ equals unity.*

This is true because each of the N variables requires on an average (1)
$N/2$ coefficients to be inspected in making out the list; (2) $Np/2$ direct
associates, the lists of which have next to be inspected; and (3) $Np^{2}/2$
inspections arising from each of these, and so on until no more associates
appear. Thus with 200 variables and a correlation level such that
one r in ten is high enough to count as a linkage (i.e., p = 1/10), 
there are some 60,000 linkages to be inspected in abstracting all pos- 
sible clusters from the table. Actually the above approximate for- 
mula may underestimate rather seriously because it assumes that the 
average frequency of linkages among themselves of the variables D, 
G, K, R, etc., linked to the first variable A is no greater than if they 
had no common relative A, but there is generally a tendency for vari- 
ables related to a common variable to have greater than chance inter- 
relatedness among themselves. 


* This neglects the reduction by one variable in each operation and assumes
an even frequency of linkage distribution.

B. The Matrix Diagonal Method. This method has been used
by Burt (1), Cardall (2) in a thesis directed by Kelley, and others, 
first as a method of arranging tables for factor analysis and also as 
a method of discovering clusters per se.* As before, the experimenter
first marks on the correlation matrix those r’s high enough to
count as linkages. (He may, further, grade them, according to two
or three sizes of the coefficients, as in Diagram 1 below.) With or 
without making out single linkage lists for the variables, he then 
manipulates the order of the variables in the matrix in an attempt to 
bring all the linkage correlations alongside the diagonal or as near 
to it as possible. If the process can be successfully carried out, the 
resultant clustering is very clearly and strikingly recorded, as in the 
illustration of Diagram 1, taken from the findings of an actual research,
based on the inter-correlations of a large number of interest tests.


DIAGRAM 1


This is a corner fragment from a correlation matrix of 60 variables, each a 
measure of interest. The only correlations considered linkages have their cells 
shaded, the rest are blank. The order of variables has been re-arranged as indi- 
cated by the numbers at the edge. 

r’s above 0.60 are indicated by solid black. 
r’s from 0.50 to 0.59 are indicated by dark shading. 
r’s from 0.40 to 0.49 are indicated by light shading.


* It is also a necessary step in the search for types by inter-correlating
persons instead of tests, in the well-known “Q-technique.”

The drawbacks of this method are: first, that it cannot be reduced
to any simple routine procedures; secondly, that it cannot be
made “fool-proof” (indeed it may present a puzzle for a genius,
becoming complex and difficult to the point of impossibility with 60 or
more variables); and, thirdly, that it is not capable of dealing at all
with the situation, by no means rare, in which two or three variables
enter into three or more distinct clusters.

C. Correlation Profile Correlation (Tryon’s Method). This 
method is too well known through Tryon’s exposition (13) to need 
description here. Moreover, its theoretical implications cannot ade- 
quately be evaluated in so brief a discussion. A cluster by this defini- 
tion is a set of variables which agree (beyond a certain arbitrary 
standard) in the profile (or rank order) of their correlations with 
all remaining variables of the matrix. This amounts to saying that 
the variables of a cluster must have their columns (or rows) of corre- 
lation coefficients in the matrix positively inter-correlating, above an 
agreed figure. Thus a set of variables chosen to satisfy Spearman’s 
two-factor theory, and therefore falling into a “hierarchy,” would 
constitute a cluster (at least, the higher members would). Variables
which showed a group factor in addition to the general factor would 
have more highly intercorrelated, i.e., more similar, profiles than 
would others, and would constitute a more distinguished cluster. 
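
The profile comparison on which this definition rests may be illustrated by a short sketch. It is not Tryon's own computing routine: it simply takes the product-moment correlation between two columns of the matrix over the remaining variables, which is our reading of "correlations with all remaining variables." The function names and the five-variable matrix are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length lists.
    Raises ZeroDivisionError if either list has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def profile_correlation(r, i, j):
    """Correlate the profiles of variables i and j, i.e. their columns of
    coefficients with every variable other than i and j themselves."""
    others = [k for k in range(len(r)) if k not in (i, j)]
    return pearson([r[i][k] for k in others], [r[j][k] for k in others])


# Hypothetical correlation matrix; variables 0 and 1 have parallel profiles.
r = [
    [1.0, 0.6, 0.7, 0.5, 0.2],
    [0.6, 1.0, 0.8, 0.6, 0.3],
    [0.7, 0.8, 1.0, 0.4, 0.1],
    [0.5, 0.6, 0.4, 1.0, 0.3],
    [0.2, 0.3, 0.1, 0.3, 1.0],
]

print(profile_correlation(r, 0, 1))
```

Variables whose profile correlations all exceed the agreed figure would then be grouped as a cluster in this second-order sense.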

The relations of clusters established on this basis to those estab- 
lished on the simple inter-correlation basis employed by the great ma- 
jority of researchers using clusters has never been sufficiently ex- 
plored. Tryon’s cluster might be called a “second-order cluster,” for 
the variables are required to have similar profiles with regard to re- 
lations to other variables, instead of with regard to endowments of 
individuals in the variables concerned. There is no immediate reason 
to assume that the two concepts of cluster are identical, i.e., that the 
variables which behave in the same way to other variables will be- 
have in the same way with respect to people. Indeed it is certainly 
possible to find instances in which the variables chosen for a first- 
order cluster would not be the same as those selected by the condi- 
tions of a second-order cluster. For example, in a well-formed Spear- 
man hierarchy it is possible to select three or four variables at the 
bottom having low saturations with the general factor (and therefore 
low mutual inter-correlation) but having profiles of correlation with 
the remaining variables which have a similarity as great as or great- 
er than that existing between variables high in the hierarchy and 
highly inter-correlated (especially if the latter have some small group 
factors breaking the hierarchy). However, in the instances of successful
use of the method by C. M. Tryon (14) and in most examples
tried out by the present writer, it seems true that a high degree of
similarity of profile and high mutual inter-correlation go together.
Assuming that this is a legitimate method of seeking ordinary clusters,
we are, however, forced to conclude in the end that it does not
contribute any short cut to cluster search methods. The matrix formed
from inter-correlating variables by columns has as many coefficients
as the original matrix. In a matrix of any size, therefore, it is still
necessary to find some short process for detecting the clusters of mu- 
tually highly related variables. 

D. The Approximate Delimitation Method. This is the title giv- 
en, for lack of a briefer one, to the method invented for comparatively 
rapid, even if approximate, handling of cluster separation in large 
matrices. It was provoked by the problem of our own research (4) 
presenting a matrix with over 14,000 coefficients. Almost exactly 
10% of the r’s in the matrix reached linkage level, so that, by the 
ramifying linkage method, some 60,000 r’s would have needed to be
examined to determine the clusters, an almost prohibitive total. 

The procedure is as follows: 

(1) As in the ramifying linkage method, a Single Linkage List
is first prepared for each variable, showing the other variables with
which it has r’s high enough to count as linkages, e.g., A links with
D, O, R, S, V, Z; B with C, D, L, N, W, Y, Z, and so on.

(2) Each single linkage list, on a card, is systematically paired
with each of the single linkage lists following it. That is to say, all
possible comparisons, $N(N-1)/2$ in number, are made between the N
single linkage lists. Whenever, in such a comparison, the two variables
compared are found to have two or more common linkages, the
two variables are listed on a new card. (Note that this has some
resemblance to Tryon’s method, for such variables have, as far as the
rough test of linkage implies it, similar profiles.) Thus A and B would
begin a card because they share D and Z.

The new card is headed by the first variable of the pair, A, and 
accumulates under this heading all the variables which have two or 
more associates in common with the first. (Two proves a convenient 
minimum.) Of course, the associates which bind these variables to- 
gether are not necessarily the same ones for different pairings of 
variables with A. If the variables on this new card list are also linked 
directly with each other, as evidenced by their being on the same single 
linkage list, they are underlined. (B above would not be underlined.) 
The new card lists are called the Triangular Linkage Lists, for each
underlined variable thereon is certain to belong to at least a triad or
“triangular linkage cluster” and almost certainly a tetrad, as Diagram 2
indicates.

An alternative procedure at this point is to examine, for the pos- 
session of common associates with the first variable, only those vari- 
ables which fall in the single linkage list of the first variable. In this 
way one accumulates a triangular linkage list consisting only of the 
variables which would be underlined in the old triangular linkage lists. 
This offers a great saving, for perhaps only one-tenth of the paired 
comparisons mathematically possible among the variables will now 
need to be made. But at first we did not adopt this alternative, argu- 
ing that though the non-underlined variables of the longer triangular 
linkage lists fail to link (correlate highly enough) directly, they 
might eventually be found to have so many common associates, form- 
ing a cluster, that the failure of this single linkage could be over- 
looked. After all, the level of correlation accepted for entry into a 
cluster is arbitrary, and it is possible (and was indeed commonly 
found) that when one linkage only is absent an examination of the 
coefficient in the original matrix will show that it falls only negli- 
gibly below the prescribed level. It is a mistake, for this reason, to 
adopt any cluster search method that is not flexible, and some of the 
objections which might be raised against the present method on the 
grounds that it is not absolutely exhaustive should be considered in 
the light of this requirement. 

However, later experience, as indicated below, showed that, at 
least in our material, no clusters were lost through working with the 
reduced triangular linkage lists, and since that procedure involves a 
great saving of time it may prove to be the better universal method. 


[Diagram: A and B, each joined by a line to D and to Z.]

DIAGRAM 2
TRIANGULAR LINKAGE LIST


If A and B are on the same list it follows (a) that they must be linked
together, and (b) that both must be linked to at least two other variables such as
D and Z, as indicated. Whether D is linked with Z is not known but such a
linkage is probable.


(3) The triangular linkage lists are less numerous than the sin- 
gle lists; for not all variables are present even in clusters of three. 
The next step aims to bring together, as by a snowball adhesion of 
smaller fragments, whatever larger clusters can be formed from the 
triads. There is no single, clear, logically-defensible step by which
this can be done. Our procedure was to match in turn, systematically,
the triangular linkage lists, running into a single list those lists which
had substantially (two-thirds) similar members. The new lists thus
obtained we call the Approximate Cluster Lists. The process has reduced
their number considerably (at least 50%) from the number of
triangular linkage lists. They are very unequal in length. Further,
they can no longer be catalogued* under the heading of any single
variable and its connections, for they are embryo clusters, or rather 
nebulae, out of each of which one or more clusters will condense when 
the final rigid criteria are applied to precipitate them. 

(4) The last step consists in setting out the relations of vari- 
ables in each approximate cluster list graphically in a square matrix. 
One need only refer to the single linkage lists to carry out this pro- 
cess, indicating each link by a cross in a cell, though one may, alter- 
natively, decide to go back to the original matrix in order to fill in 
the cells with actual coefficients. Direct inspection will show the clus- 
ters present, though it may be necessary occasionally to rearrange 
the order of variables for greater clarity. One approximate cluster 
list commonly yields, by trimming in the matrix, a major cluster and 
some smaller ones. For example, a list of 17 variables yielded a clus- 
ter of 10, another of 6, three tetrads and eight triads. It is compara- 
tively easy to lose the independent triads, but in our research we were 
interested in and finally recorded only clusters which were tetrads or 
larger (4). 
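
Processes (2) and (3) above may be sketched as follows. This is an illustration under stated assumptions, not a transcript of the routine actually used: the function names are ours, the "two-thirds similar members" rule is read here as a shared membership of at least two-thirds of the shorter list, and the merging is done in a single pass.

```python
from itertools import combinations


def build_links(single_lists):
    """Make the linkage relation symmetric: if A lists D, D also lists A."""
    links = {}
    for var, partners in single_lists.items():
        for other in partners:
            links.setdefault(var, set()).add(other)
            links.setdefault(other, set()).add(var)
    return links


def triangular_linkage_lists(links, min_common=2, underlined_only=True):
    """Process (2): pair every list with each list following it and enter,
    under the first variable, each later variable sharing at least
    `min_common` associates. With `underlined_only`, the pair must also
    be linked directly (the shorter, alternative procedure)."""
    lists = {}
    for a, b in combinations(sorted(links), 2):
        if underlined_only and b not in links[a]:
            continue
        if len(links[a] & links[b]) >= min_common:
            lists.setdefault(a, [a]).append(b)
    return lists


def approximate_cluster_lists(tri_lists):
    """Process (3): run together lists sharing two-thirds of the members
    of the shorter list (our assumed reading of the two-thirds rule)."""
    merged = []
    for members in tri_lists.values():
        group = set(members)
        for existing in merged:
            smaller = min(len(group), len(existing))
            if 3 * len(group & existing) >= 2 * smaller:
                existing |= group
                break
        else:
            merged.append(group)
    return merged


# The two single linkage lists quoted in process (1).
links = build_links({"A": "DORSVZ", "B": "CDLNWYZ"})
print(triangular_linkage_lists(links, underlined_only=False))
```

With `underlined_only=False` the variables A and B begin a card because they share D and Z, as in the text; with the default setting they do not, since A and B are not themselves linked.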

Explained briefly and without much illustration, the above steps
may appear somewhat complex, and their rationale somewhat obscure,
but in actual practice they are entirely simple, provided one proceeds
always through the variables very systematically and in one
direction.


* The problem of cataloguing clusters and embryo clusters so that the
belongingness of a variable, or the developing clusters themselves, can be readily
traced, presents peculiar difficulties. Yet the whole organization of personality
research, so far as it concerns collation, comparison and cross identification of
clusters, either in traits or objective tests, would be facilitated a good deal by
some efficient general system.

In the first place, at the triangular linkage stage, the embryo clusters are
indexed under the first variable, which is scarcely more important than any other.
At the approximate cluster list stage, the initial variable is quite unindicative
of the character of the list and the experimenter has to depend on his memory
of the general composition of each mass in order to identify or refer to it.

Even when the true cluster has finally emerged to a stable life of its own,
to a recognized pattern and perhaps “a local habitation and a name,” a fairly
difficult problem of indexing remains. Both embryo clusters and final clusters
can be kept accessible only by having some rigid order, alphabetical or numerical,
in the original variables and deriving the cluster index from this. If one then
proceeds, in the exploration of linkages, systematically in one direction, the
cluster aggregates can be indexed under the first variables in the cluster, and, next,
under the second variable. Even so, one has difficulty in locating a cluster
having a certain pattern of variables occurring beyond the second or third variable
in its composition, so that eventually the only satisfactory indexing is one which
lists opposite each variable, in a square table, all the clusters in which it
participates.


3. Choice of the Most Suitable Method 

For the determination of operational clusters in the simplest fashion,
as the above comments indicate, the choice lies between methods
A and D. The first of these, the ramifying linkage method, is entirely
sure, but inexorably slow. The approximate delimitation method is 
excellent for detecting all clusters of appreciable size in a very large 
matrix, but it may let smaller clusters, of three or four variables, slip 
through its meshes. To make possible an appropriate choice between 
these methods it is necessary to compare them further. 

The approximate delimitation method has the advantage of being
to some extent adaptable in adjusting to time available and objectives
required. For example, if one is interested only in finding the
very large clusters, of, say, eight components or more, it may be nec- 
essary to list, in the second process of the method, i.e., the produc- 
tion of triangular linkage lists, only those variables which resemble 
each other by having at least eight common associated items (though 
it would, of course, be safer to fix a lower limit, at, say, six items). 

Also, in most cases, it seems entirely safe to shorten process 2 
by taking only the underlined linkages, as indicated above. For the 
linkage which is excluded from one triangular linkage list because it 
does not make a perfect triad is not entirely lost: it turns up from 
another angle in another list. 

The approximate delimitation method operates by, as it were,
roughly blocking in the cluster and its appendages, converging upon 
it from several directions, and leaving to the fourth process—that of 
drawing up a small matrix—the final chiselling out of the cluster from 
the block. The ramifying linkage method, on the other hand, picks up 
the cluster by one extremity and gradually disinters it by following 
up all the roots. 

The latter is very sure, but gives every “phenomenal cluster” 
(see below) in detail, which is not always required. A more precise 
calculation of the comparative time and labor requirements of each 
will now be made and will show the very great gain to be obtained
from using the former method.

There are, first, $N^{2}/2$ operations* in making the single linkage
lists, as in the ramifying linkage method. Process 2 (making all possible
paired comparisons of these to get triangular linkage lists) requires
$N^{2}p/2$ operations, if only underlined variables are used. If, as
seems usual, only two-thirds of the variables yield triangular linkage
lists, the next step requires $(\tfrac{2}{3}N)^{2}$ operations, leading to the
approximate cluster lists. The total is thus

$$\frac{N^{2}}{2} + \frac{N^{2}p}{2} + \frac{4N^{2}}{9}.$$

This exaggerates, because in the third step of condensing into approximate cluster
lists not all the comparisons need to be made. Some lists coalesce relatively
early, reducing the number of lists in the comparison operation.
For 200 variables there are thus about 40,000 operations, as compared
with 60,600 for the other method.


* The same approximations will be made here as with the ramifying linkage
method, namely, neglecting the reduction of combinations by one (i.e., N-1 is
called N), and assuming an even distribution of frequency of linkages.

This comparison, however, does not do justice to the magnitude of 
the saving by the approximate delimitation method. In both methods a 
further procedure remains. The fourth process of the approximate de- 


limitation method requires the making out and examining of about iz 


small matrices. The last process of the ramifying linkage method, on
the other hand, requires an inter-comparison of all the clusters found; 
for, as the discussion below shows, the same individual clusters are 
picked up again and again from different angles. This process of sim- 
plification (requiring a fairly elaborate and vigilant bookkeeping) is 
a more difficult and prolonged one than the fourth process of the
approximate delimitation method.


4. The Nature of Clusters: Nuclear and Phenomenal Clusters 

Any final evaluation of the utility of different search methods 
depends upon the type of cluster one is seeking. The discussion of 
methods is therefore appropriately brought to a close by considering 
the second topic of this paper, namely, the definition of the varieties 
of correlation clusters. 

Current theoretical discussions on clusters, in personality or cog- 
nitive performances, generally proceed on the assumption that clus- 
ters are well defined, discrete entities, and of one kind only. Any 
actual exploration of correlation matrices, however, in which corre- 
lations above a certain size are represented by linkages, reveals al- 
most invariably a somewhat bewildering network of partly linked 
and more or less overlapping clusters. Fortunately, in most data, the 
overlapping is not purely haphazard or evenly distributed: there is
rather a tendency for it to occur about a more limited number of
nuclei. The whole discussion of this subject can therefore be clari- 
fied if we distinguish two kinds of clusters, which, for lack of better 
terms, we may call nuclear and phenomenal clusters, and which may 
be illustrated in the diagram below. 





DIAGRAM 3 


A First Phenomenal Cluster, containing variables a, b, c, d, f, g. 
B Second Phenomenal Cluster, containing variables c, d, e, f, g, h, i.
C Third Phenomenal Cluster, containing variables c, d, f, g, h, j, k.
D Nuclear Cluster, containing variables c, d, f, g. 


This represents three phenomenal clusters, A, B, and C, within each 
of which all the variables have mutual correlations of a high level. 
They overlap in the variables c, d, f, g, but no other variables are com- 
mon to all three. Thus e correlates sufficiently with c, d, f, and g, but 
not with a and b or with j and k. The cluster c, d, f, g may therefore 
be called a core or nucleus, and the outlying portions of the phenome- 
nal clusters might be called the “appendages” of this nuclear cluster. 
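
In set terms the nuclear cluster is simply what the phenomenal clusters have in common, and the appendages are what remains of each. A small sketch, reading the variable lists of Diagram 3 as a, b, c, d, f, g; c, d, e, f, g, h, i; and c, d, f, g, h, j, k (the names `phenomenal`, `nuclear`, and `appendages` are ours):

```python
# Phenomenal clusters A, B, and C of Diagram 3, as sets of variables.
phenomenal = {
    "A": set("abcdfg"),
    "B": set("cdefghi"),
    "C": set("cdfghjk"),
}

# The nucleus is the set of variables common to every phenomenal cluster.
nuclear = set.intersection(*phenomenal.values())

# Each appendage is the part of a phenomenal cluster lying outside the nucleus.
appendages = {name: sorted(members - nuclear)
              for name, members in phenomenal.items()}

print(sorted(nuclear))
print(appendages)
```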

The ramifying linkage method gives all three of the above phe- 
nomenal clusters separately, at widely separated points in the syste- 
matic search, requiring the experimenter to recognize the kinship or 
overlap and to superimpose the clusters in order to discover the nu- 
clear cluster. The approximate delimitation method, on the other 
hand, gives a single matrix, covering all the variables—a through k. 
Within this matrix the relationships can be at once seen, and from it, 
paring like a gem-cutter upon the rough stone, the experimenter can 
isolate the individual clusters. 

If one is aiming at nuclear clusters only, the exhaustive ramify- 
ing linkage method is a waste of time. Yet if one attempts to shorten 
it by not pursuing the method exhaustively, e.g., by following up only 
the first half of the leads, the results are misleading. Undue impor- 
tance in determining the ultimate shape of the cluster is given to 
the variables which, by the accident of alphabetical order, happen to
be first on the scene. Variables which fail to correlate to the specified
amount with these first variables are dropped, with all their ramifi- 
cations unexplored, though they might in fact correlate satisfactorily 
with all later variables in the cluster. 

Whether the true goal of research is to find phenomenal clusters 
or nuclear clusters is a question to be decided by the problem and by 
circumstantial considerations which lie outside the scope of this note. 
For some personality problems it may be important to know the phe- 
nomenal clusters—to know, for example, that though variables a and 
b on the one hand and j and k on the other adhere as “appendages” 
to the same temperamental nuclear cluster c, d, f, g, they are in fact
alternative derivatives, not appearing simultaneously in the same per- 
sonality. (They are probably opposite aspects of what Burt (1) calls 
a “bipolar factor.”) At the present stage of personality research, 
however, the mapping of appendages and conditional fringes should 
perhaps be considered secondary to the urgently necessary task of 
establishing the main nuclear clusters. Moreover, until these are con- 
firmed by several researches and delimited with some accuracy, the 
appendage outlines and the variables on the fringes ought scarcely to 
be involved in serious hypotheses, for they may represent nothing 
more than the effects of sampling errors pushing correlations now 
above, now below the accepted level for linkage (among variables on 
the borderline of admission to a cluster). Finally, in deciding be- 
tween these two methods and objectives, one must bear in mind that 
if the object of research is to reduce the number of variables with 
which further research has to deal, the ramifying linkage method, 
with its extensive harvest of phenomenal clusters, fails. Indeed in a 
majority of the researches we have studied the phenomenal clusters 
are actually somewhat more numerous than the variables from which 
they are isolated. 

Further consideration of varieties of clusters will show that there 
are not only clear phenomenal and nuclear clusters, but various de- 
grees of nuclearity, requiring, perhaps, special terms. For the pur- 
pose of our own research (4) we listed only strictly nuclear clusters 
(c, d, f, and g in the example above), expressing the chief phenomenal
clusters as “alternative appendages” (e.g., h, h-e-i, a-b, and
j-k), always recorded alongside the nuclear cluster; but in other
researches some record of intermediate nuclear clusters, e.g., c-d-f-g-h,
might be appropriate.

All questions concerning (a) the indexing of clusters as depen- 
dent or independent, (b) the decision as to degrees of nuclear belong- 
ingness, and (c) agreement on the number of clusters to which the 
matrix can be reduced, are very intimately tied up with the level of
correlation accepted as the criterion for admission to a cluster. The
situation is really no different from that facing the cartographer, 
when he has to decide to what land masses the terms hill and moun- 
tain are to be attached. A matrix which yields, in a certain area, two 
distinct clusters when a high level of correlation is demanded may 
yield only one large cluster when the level is lowered, as two islands 
become one when the tidal level falls. (See Diagram 4 below, in 
which this relativity of clusters is rendered particularly obvious 
through the spatial expression of correlations as cosines.) 
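
The two-islands effect can be shown with a small numerical sketch. It assumes, for illustration only, four variables drawn as unit vectors in a plane at the angles given in `ANGLES` (values invented), so that each correlation is the cosine of the angle between two vectors; a cluster is taken, as throughout this article, to be a set in which every pair reaches the agreed level.

```python
from itertools import combinations
from math import cos, radians

# Hypothetical positions of four variables, in degrees, in a plane.
ANGLES = {"a": 0.0, "b": 10.0, "c": 55.0, "d": 65.0}


def correlation(u, v):
    """Correlation expressed as the cosine of the angle between u and v."""
    return cos(radians(abs(ANGLES[u] - ANGLES[v])))


def is_cluster(variables, level):
    """True if every pair of the variables correlates at `level` or above."""
    return all(correlation(u, v) >= level
               for u, v in combinations(variables, 2))


for level in (0.75, 0.40):
    print(level,
          is_cluster("ab", level),
          is_cluster("cd", level),
          is_cluster("abcd", level))
```

At the higher level the pairs a, b and c, d stand as two separate clusters; at the lower level all four variables satisfy the criterion together.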


5. Clusters or Factors? 

It seems to be the contention of those recent researches which 
have preferred cluster analysis to factor analysis that while clusters 
reduce the number of variables practically as effectively as factors, 
they enjoy a greater reality than factors. We shall debate this; but 
before doing so we shall admit one very real advantage to clusters, 
namely, that they permit the results of different researches to be rela- 
tively easily combined. When the results of several different cluster 
researches, on the same variables and similar populations, are col- 
lated, it is possible to see at once where the results of one research 
are approximately confirmed by those of another, for no problem of 
rotation or of factor system arises. And when partly different sets 
of variables are used in successive researches it is possible to augment 
the description of clusters arrived at in the first research by adding 
those variables in the second which belong to recognizably similar 
clusters. (Those accumulated clusters will need later confirmation, 
and proof that no variables stray outside the accepted angle.) Since 
few researches use exactly the same sets of variables, this possibility 
of cumulative knowledge through overlapping researches is very at- 
tractive. With factor analysis, on the other hand, where the factori- 
zation varies with the mathematical system adopted and with the 
trait population, and where experimenters may not use the same ro- 
tation and orientation of axes, the utilization in a single integrated 
conclusion of results from different researches (without going back 
to the original correlations) is often practically impossible. 

But the notion that the cluster enjoys an absolute reality and 
stability cannot be left undisputed. In the first place, the member- 
ship and shape of a cluster is affected as much as, and commonly more 
than, a factor, by sampling errors of population and by varying
reliabilities of the tests. Secondly, even when freed of such errors,
clusters have intrinsic indeterminacy. Consider the situation in Diagram
4 below, in which correlations are represented geometrically (for
simplicity in two dimensions). In a research using variables a
through k and a cluster criterion of a minimum correlation of 0.75
there are three clusters, A, B, and C. If the criterion is lowered to
0.40 there are two clusters, A’ and B’, overlapping. If, with the first
criterion, the variables are increased by adding l and m, correlating
in the manner shown, there appears a series of continuously overlapping
clusters from a to k, such that only by some further arbitrary
step is it possible to determine what clusters shall be said to exist. In
some situations, incidentally, the arbitrariness of cluster demarcation
may be avoided by borrowing from factor analysis Holzinger’s B-coefficient
technique, so that a cluster is defined by a density of grouping
of the variable lines (in Diagram 4) relative to the general density
in that region (8). This method could, in fact, be substituted
anywhere for the rigid, minimum-correlation criterion adhered to in
this article, so that clusters might have widely different degrees of
internal inter-correlation, depending on the surrounding density in
their field. But it would not solve the present problem. In short, a
cluster may become quite as arbitrary and quite as indeterminable
by the nature of the given correlations alone, as any factor. Fortunately
clusters are not so likely to run together in the above confusing
fashion in n-dimensional space as in a two-dimensional diagram, and
with a fairly high criterion of cluster belongingness, e.g., a minimum
r of 0.6, problems of separating clusters seldom arise.


DIAGRAM 4


Elsewhere (3) the present writer has put forward reasons for
believing that a psychological functional unity, i.e., an entity real in
some psychological causal sense, is more likely to be represented as a
factor than as a cluster in observed covariation relationships. A cluster,
it is true, may arise from a factor, in the sense of being constituted
by those traits which are highly loaded by that factor; but a
cluster is at least as likely to represent instead a set of variables which
share the cumulative overlap of several factors, not one of which is
highly represented in them. If the regions (sets of variables) of overlap
of factors are affected by local and circumstantial conditions
whereas the factors represent real influences, it follows that many 
clusters will be transient, unstable phenomena, while factors will be 
more constant in appearance and in meaning. For example, general 
intelligence may be a factor and the pattern of a classical education 
may be a factor, but the cluster which appears as a high correlation 
of intelligence items and knowledge of classics items, and which guided 
many scholastic and civil service appointments in the last century, 
may disappear with the disappearance of the practice of directing the 
most intelligent students into classical studies. 

The maintenance of cluster analysis and factor analysis in their 
true roles and relationships may perhaps be best assured by the brief 
dictum that clusters are essentially representations at the descriptive 
level, and as such are little better than straight statements of the 
correlation coefficients, whereas factors are statements at the inter- 
pretive level. If the interpretations are correct the factors have more 
permanent value and far wider utility. But until there is more gen- 
eral agreement on the rotation of axes it may be desirable to publish 
analyses of correlation matrices both in cluster analysis and in factor 
analysis form; for the former will preserve results in a shape suitable 
for immediate collation with those of other researches. Secondly, as 
in the research from which these observations arise, (4, 5) the cluster 
analysis may be used as a first reduction of variables, to provide a 
briefer list upon which a factor analysis can be more practicably car- 
ried out than on the original plenary list. Naturally this factor analy- 
sis at one remove will not yield all the factors required to account for 
the variance of all variables of the original plenary list, but it will 
provide the more important ones for the major structuring of the 
field, i.e., a broad framework within which the findings of later, more 
restricted and local factor analyses can be fitted in perspective. 

FUNDAMENTAL FACTORS OF COMPREHENSION 
IN READING 


FREDERICK B. DAVIS 
COOPERATIVE TEST SERVICE OF THE AMERICAN COUNCIL ON EDUCATION* 


A survey of the literature was made to determine the skills in- 
volved in reading comprehension that are deemed most important by 
authorities. Multiple-choice test items were constructed to measure 
each of nine skills thus identified as basic. The intercorrelations of 
the nine skill scores were factored, each skill being weighted in the 
initial matrix roughly in proportion to its importance in reading 
comprehension, as judged by authorities. The principal components 
were rather readily interpretable in terms of the initial variables. In- 
dividual scores in components I and II are sufficiently reliable to war- 
rant their use for practical purposes, and useful measures of other 
components could be provided by constructing the required number 
of additional items. The results also indicate need for workbooks 
to aid in improving students’ use of basic reading skills. The study 
provides more detailed information regarding the skills measured by 
the Cooperative Reading Comprehension Tests than has heretofore 
been provided regarding the skills actually measured by any other 
widely used reading test. Statistical techniques for estimating the 
reliability coefficients of individual scores in principal-axes compo- 
nents, for determining whether component variances are greater 
than would be yielded by chance, and for calculating the significance 
of the differences between successive component variances are illus- 
trated. 


The application of techniques of factorial analysis to the inves- 
tigation of reading has been attempted several times. Feder (11), 
Gans (12), and Langsam (23) have published studies that employed 
Thurstone’s centroid method, and unpublished studies have been made 
by Bedell and Pankaskie. So far as the writer is aware, the study re- 
ported here is the first to make use of tests especially constructed to 
measure the mental skills in reading comprehension that are consid- 
ered of greatest importance by authorities in the field.** 

The most important step in a study that employs factorial pro- 
cedures for the investigation of reading comprehension is the selec- 
tion of the tests the scores of which are to be factored. Unless these 
tests provide reasonably valid measures of the most important mental 
skills that have to be performed during the process of reading, the 
application of the most rigorous statistical procedures can not yield 
meaningful and significant results. The importance of this point can 
hardly be overstated. 


* On leave for military service. 
** For a detailed presentation of the basic data of this study, see (8). 
As the first step in the present study, a careful survey was made 
of the literature to identify the comprehension skills that are deemed 
most important by authorities in the field of reading. A list of several 
hundred specific skills was compiled, many of them overlapping. This 
list of skills was studied intensively by the writer in order to group 
together those that seemed to require the exercise of the same, or 
closely related, mental skills. The objective was to obtain several 
groups of skills, each one of which would constitute a cluster having 
relatively high intercorrelations and relatively low correlations with 
other clusters of skills. 

Nine groups of skills were sorted out and labeled. For the pur- 
poses of this study, they are regarded as the nine skills basic to com- 
prehension in reading. Included within them is the multitude of spe- 
cific skills considered important by the authorities consulted. These 
nine basic skills are as follows: 


1 Knowledge of word meanings 

2 Ability to select the appropriate meaning for a word or phrase 
in the light of its particular contextual setting 

3 Ability to follow the organization of a passage and to identify 
antecedents and references in it 

4 Ability to select the main thought of a passage 

5 Ability to answer questions that are specifically answered in 
a passage 

6 Ability to answer questions that are answered in a passage 
but not in the words in which the question is asked 

7 Ability to draw inferences from a passage about its contents 

8 Ability to recognize the literary devices used in a passage and 
to determine its tone and mood 

9 Ability to determine a writer’s purpose, intent, and point of 
view, i.e., to draw inferences about a writer 


To provide a measure of each one of these nine basic skills, a 
large number of five-choice objective test items were constructed. All 
possible care was taken to obtain items that measured only one rather 
than several of the nine skills. However, it was recognized that skill 
1 (knowledge of word meanings) is basic to the measurement of all 
the other skills, since to read at all one has to recognize words and 
understand their meanings, and that some overlapping of skills 2-9 
is inevitable. 

Since the final forms of the reading-comprehension tests used in 
this study were to be the published forms of Tests C1 and C2 of Form 
Q of the Cooperative Reading Comprehension Tests, practical consid- 
erations [notably the requirements of the procedure for obtaining 
three equivalent “scales” in the tests (6)] determined in some meas- 
ure the number of items testing each basic skill that could be used. An 
effort was made, however, to include the proportion of items testing 
each one of skills 2-9 that conformed to the judgments of authorities 
in the field of reading. 

To obtain the intercorrelations of scores in the nine basic reading 
skills selected for measurement, 240 multiple-choice items were admin- 
istered to a large number of freshmen in several teachers colleges.* 
The students were told to mark every item and were allowed an un- 
limited amount of time. By this means, the influence of speed of read- 
ing was removed and the effects of mechanical difficulties in word 
perception were minimized. Of the 541 students tested, 421 actually 
answered every item, and, when proof was obtained that this group 
constituted a representative sample of the entire 541 students tested, 
the scores of only these 421 pupils were used in the factorial analysis. 
In addition to the intercorrelations of the scores, the correlations be- 
tween sex and scores in each of the nine skills were computed. As 
would have been expected, the correlations with sex were all insig- 
nificantly different from zero. This being so, there was no need to 
partial out the influence of sex before making a factorial analysis. 

Table 1 shows the intercorrelations of the scores in the nine basic 
reading skills, and their relationships with sex. 


TABLE 1 

Intercorrelations, Means, and Standard Deviations of Raw Scores in the 
Nine Basic Reading Skills, and Their Relationships with Sex 
(N = 421) 

Skill     2     3     4     5     6     7     8     9    Sex*    Mean    S.D. 
  1      .72   .41   .28   .52   .71   .68   .51   .68    .08   23.77   11.61 
  2            .34   .36   .53   .71   .68   .52   .68   -.07   12.70    3.25 
  3                  .16   .34   .43   .42   .28   .41   -.01    4.20    1.73 
  4                        .30   .36   .35   .29   .36   -.03    2.97    1.10 
  5                              .64   .55   .45   .55   -.04   18.10    2.46 
  6                                    .76   .57   .76    .01   25.67    5.67 
  7                                          .59   .68    .06   28.46    5.81 
  8                                                .58   -.05    6.75    1.86 
  9                                                      -.05   15.19    4.07 

* A positive coefficient in this column indicates that the men obtained a higher mean score than 
the women. 


* Every freshman in all of the teachers colleges of the State of Connecticut 
and every freshman in two of the Massachusetts State Teachers Colleges com- 
prised the sample tested. The testing was done about a month after the begin- 
ning of the school year. 
The intercorrelations of the nine basic skills range from .16 to 
.76, the values reflecting in part their true relationships and in part 
the differences in their reliability. The reliability coefficients of the 
scores in the nine skills are shown in Table 2. 


TABLE 2 
Reliability Coefficients of Raw Scores in Each of the Nine Basic Reading Skills* 

Skill    r11      N     Number of Items 
  1      .90     100          60 
  2      .56     100          22 
  3      .?4     100           9 
  4      .18     421           5 
  5      .55     100          22 
  6      .77     100          42 
  7      .63     100          43 
  8      .64     100          10 
  9      .71     100          27 








* The division of each test into two halves was accomplished in this case by arranging the 
items in order of difficulty and assigning alternate items to each half. It will be recalled that 
speed had no influence on these scores. The reliability coefficient for skill 4 is based on 421 cases; 
the reliability coefficients for the other skills are based on a representative sample of 100 cases 
drawn from the 421 available. 


As would be expected in view of the widely different lengths of 
the tests used to measure the nine basic reading skills, their reliabil- 
ity coefficients differ considerably. For even the least reliable, how- 
ever, the reliability coefficient is substantially and significantly great- 
er than zero. 

Subjective judgment had forecast relatively high correlations be- 
tween skill 1 and each of skills 2-9. Inspection of Table 1 in the light 
of the data in Table 2 reveals this to be so. It is apparent that skill 1 
constitutes the largest element common to all of the other initial vari- 
ables; hence, it may be of interest to study the intercorrelations of 
skills 2-9 when skill 1 is held constant. These partial coefficients are 
shown in Table 3. 











TABLE 3 
Partial Correlation Coefficients Among Skills 2-9, Skill 1 Being Held Constant 

(N = 421) 

Skill     3     4     5     6     7     8     9 
  2      .09   .23   .26   .40   .38   .26   .37 
  3            .05   .16   .22   .22   .09   .20 
  4                  .19   .23   .22   .17   .24 
  5                        .45   .32   .26   .32 
  6                              .53   .33   .53 
  7                                    .38   .40 
  8                                          .38 
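
The coefficients of Table 3 are first-order partial correlations. The text does not print the formula; the sketch below assumes the standard one, r_jk.1 = (r_jk - r_j1 r_k1) / sqrt((1 - r_j1^2)(1 - r_k1^2)), and applies it to the two-decimal values of Table 1. The function name `partial_r` is illustrative only. Because the tabled partials were presumably computed from less heavily rounded correlations, agreement to within a unit in the second decimal is all that should be expected.

```python
import math


def partial_r(r_jk, r_j1, r_k1):
    """First-order partial correlation of j and k with variable 1 held constant."""
    return (r_jk - r_j1 * r_k1) / math.sqrt((1 - r_j1 ** 2) * (1 - r_k1 ** 2))


# Skills 5 and 6, from Table 1: r56 = .64, r15 = .52, r16 = .71
r56_1 = partial_r(0.64, 0.52, 0.71)
print(f"r56.1 = {r56_1:.2f}")
```

For skills 5 and 6 the result may be compared with the corresponding entry in row 5 of Table 3.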
Perhaps the most surprising feature of the data in Table 3 is the 
small size of the coefficients. After making due allowance for the at- 
tenuation resulting from the comparatively low reliability coefficients 
of some of the variables, it is apparent that reading comprehension, 
as measured by the nine basic reading skills, is not a unitary ability. 
From the correlations it appears probable that a mental ability 
present to the greatest extent in skills 6, 7, and 9 is second most im- 
portant in producing the intercorrelations shown in Table 1. To ex- 
plore this matter, a factorial analysis was undertaken, using the 
method described by T. L. Kelley (22).* 

The initial matrix of variances and covariances used in the fac- 
torial analysis is presented in Table 4. 











TABLE 4 
Initial Matrix of Variances and Covariances* 

Variable     x1      x2      x3      x4      x5      x6      x7      x8      x9 
   x1      134.70   27.01    8.16    3.65   14.77   46.88   45.78   11.04   32.07 
   x2               10.56    1.94    1.29    4.22   13.08   12.90    3.17    8.93 
   x3                        3.01    0.31    1.44    4.24    4.24    0.90    2.91 
   x4                                1.22    0.82    2.25    2.25    0.59    1.63 
   x5                                        6.05    8.93    7.85    2.07    5.53 
   x6                                               32.17   24.89    5.96   17.42 
   x7                                                       33.75    6.33   16.00 
   x8                                                                3.46    4.42 
   x9                                                                       16.54 








* Variances are shown in the diagonal cells. The Kelley method would be equally applicable 
if the scores in variables 1-9 were transformed into standard measures. In this case, the variance 
in each diagonal cell would be 1 and the covariances would be identical with the intercorrelations 
shown in Table 1. The resulting matrix would undoubtedly present a more familiar appearance to 
many students. Each one of the basic reading skills would then have been weighted equally for 
purposes of factorial analysis. However, authorities in the field of reading quite reasonably do not 
judge each one of the basic skills to be of equal importance in the process of reading comprehen- 
sion. Of the many possible factorial analyses (using different weights), that analysis which ap- 
pears to have unique merit is a principal-axes solution based on a matrix of variances and covari- 
ances in which the initial test variances are weighted to correspond with their relative importance 
in the process of reading, as determined by the pooled judgment of authorities. That is the type 
of factorial analysis that it was intended should be performed in the present study, but practical 
considerations resulted in some modifications in the relative weights of the nine initial variables. 

For purposes of comparison, the Kelley method was used to perform a factorial analysis of 
the correlation matrix shown in Table 1 (excluding sex) with unit variances in the diagonals. A 
comparison of the factor loadings derived from the two principal-axes analyses and from a cen- 
troid analysis of the same data is now in preparation. 
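
The relation between Table 4 and Table 1 described in this footnote can be checked directly: dividing a covariance by the square root of the product of the two variances gives the corresponding correlation. The following is a minimal sketch using three entries of Table 4; the function name `correlation` is illustrative and not part of the original study.

```python
import math


def correlation(cov_jk, var_j, var_k):
    """Correlation implied by a covariance and the two variances."""
    return cov_jk / math.sqrt(var_j * var_k)


# (covariance, variance of first skill, variance of second skill), from Table 4
pairs = {
    "r12": (27.01, 134.70, 10.56),
    "r34": (0.31, 3.01, 1.22),
    "r67": (24.89, 32.17, 33.75),
}
for name, (cov, var_j, var_k) in pairs.items():
    print(name, round(correlation(cov, var_j, var_k), 2))
```

The three values agree with the entries .72, .16, and .76 of Table 1, the last two being the smallest and largest intercorrelations reported.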


In Table 5 are presented the coefficients of each of the initial 
variables (the nine basic reading skills) that yield the nine indepen- 
dent components obtained by factorial analysis. The design shown 
in Table 5 is one of the most interesting that has been obtained by 
factorial techniques. 

* For this study it was desirable to obtain the factor loadings of all signifi- 
cant components rather than the loadings for only the two or three largest com- 
ponents; hence a fairly large number of subjects was tested and Kelley’s method 
was selected as being most suitable for use. 
TABLE 5 

Coefficients of Each of the Initial Variables That Yield Scores in the 
Nine Independent Components 

(Factor Loadings of Skills 1-9 in Components I-IX) 

Components      I       II      III     IV      V       VI      VII     VIII    IX 
Variance     192.270  22.824   8.657   5.282   3.828   3.306   2.327   1.956   1.006 

Skills                                                                                Variance 
   1           .813   -.571   -.064   -.033   -.082    .006   -.016    .001    .011   134.699 
   2           .184    .124   -.005   -.003    .971   -.019   -.017   -.028   -.076    10.563 
   3           .057    .054   -.001    .000   -.000    .000    .997    .000   -.004     3.009 
   4           .027    .048   -.000    .000    .067    .000    .000    .000    .996     1.220 
   5           .107    .149    .152   -.003   -.022    .970   -.014   -.024   -.012     6.050 
   6           .341    .469    .567   -.531   -.129   -.204   -.044   -.001   -.023    32.169 
   7           .336    .580   -.719    .008   -.147   -.020   -.051   -.091   -.028    33.752 
   8           .078    .105   -.001    .141   -.000    .000   -.010    .981   -.007     3.456 
   9           .233    .258    .366    .835   -.080   -.126   -.027   -.166   -.013    16.540 








The subjective judgment exercised in constructing the tests of 
the nine reading skills is reflected in the surprising extent to which 
several of the tests appear to be moderately “pure” factor measures. 
A word of caution must, however, be injected. Because some of the 
skills were judged to be more important than others in the reading 
process and because practical considerations governed to some extent 
the number of items used to measure each of the nine reading skills, 
the standard deviations of the initial variables differed considerably. 
And, since the initial matrix of variances and covariances used for 
the analysis reflected those differences, the coefficients in Table 5 must 
be interpreted with due regard for the magnitudes of the standard 
deviations of the nine initial skills. Scores in skill 1, for example, 
have a large standard deviation in comparison with the standard devi- 
ations of scores in the other skills. So a small component loading in 
skill 1 may be found to have more weight in a regression equation for 
obtaining scores in any one of the components than would be expected 
from an inspection of Table 5 alone.* 


* Readers who are most familiar with the centroid method of factorial analy- 
sis have sometimes questioned this statement. A principal-axes analysis makes it 
possible to obtain very readily a given individual’s score in any one of the com- 
ponents for which regression coefficients (or factor loadings) have been deter- 
mined. For example, individual scores in component I may be obtained from the 
following regression equation: 


$$C_1 = .813X_1 + .184X_2 + .057X_3 + .027X_4 + .107X_5 + .341X_6 + .336X_7 + .078X_8 + .233X_9.$$ 

In this equation, variables 6 and 7 have nearly identical regression coeffici- 
ents, but we know that the standard deviation of variable 6 is 5.67 while that of 
variable 7 is 5.81. Therefore, variable 7 will have a slightly greater weight in 
determining an individual’s score in component I than will variable 6 despite the 
A study of the values in Table 5 (making due allowance for the 
magnitudes of the standard deviations of the initial variables) re- 
veals that the nine components are rather readily identifiable in terms 
of the original nine reading skills. Component I is clearly word knowl- 
edge (skill 1). Its positive loadings in each of the nine basic reading 
skills reflect the fact that to read at all it is necessary to recognize 
words and to recall their meanings. 

It is clear that word knowledge plays a very important part in 
reading comprehension and that any program of remedial teaching 
designed to improve the ability of students to understand what they 
read must include provision for vocabulary building. When one com- 
bines the evidence that word knowledge is so important an element in 
reading with the fact that the development of an individual’s vocabu- 
lary is in large measure dependent on his interests and his background 
of experience, the relatively low correlations between reading tests in 
different subject-matter fields are understandable.* There is, how- 
ever, no necessity to conclude that all of the fundamental factors of 
comprehension in reading are not involved in reading materials in 
various subject-matter fields. 

Component II has been termed a measure of reasoning in read- 
ing. It has its highest positive loadings in the two reading skills that 
demand ability to infer meanings and to weave together several state- 
ments. It may seem puzzling at first that this component should have 
a strong negative loading in skill 1 (word knowledge), but considera- 
tion of the psychological meaning of components I and II indicates 
that this should be expected. The explanation undoubtedly lies in the 
fact that individuals who know accurately the meanings of a great 
many words are thereby given a head start toward getting the mean- 
ing of what they read. Therefore, if we are to measure reasoning in 
reading independently of word knowledge, we must give individuals 
who are deficient in word knowledge a “handicap” and then see how 
well they reason when they are placed on equal terms with their fel- 
lows in word knowledge. Component II apparently measures the abil- 
ity to see the relationships of ideas. 


* For data on this point see (5). 





fact that the factor loadings of variables 6 and 7 in component I are almost the 
same. 
A simple and convenient aid in interpreting the regression coefficients with 
proper regard for the sizes of the standard deviations of the initial variables is 
to construct a table containing each regression coefficient multiplied by the ap- 
propriate standard deviation of an initial variable. For example, the factor load- 
ing of skill 1 in component I (.813) would be multiplied by the standard devia- 
tion of skill 1 (11.61), yielding 9.4; the factor loading of skill 2 in component I 
(.184) would be multiplied by the standard deviation of skill 2 (3.25), yielding 
.6; and so on. 
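
The aid described in this footnote amounts to an elementwise product of loadings and standard deviations. The sketch below carries it out for component I, using the loadings of Table 5 and the standard deviations of Table 1. The loading of skill 7 is taken as .336, the reading consistent with the remark that variables 6 and 7 have nearly identical coefficients; the variable names are illustrative.

```python
# Loadings of skills 1-9 in component I (Table 5) and raw-score S.D.s (Table 1)
loadings_I = [0.813, 0.184, 0.057, 0.027, 0.107, 0.341, 0.336, 0.078, 0.233]
std_devs = [11.61, 3.25, 1.73, 1.10, 2.46, 5.67, 5.81, 1.86, 4.07]

# Each regression coefficient multiplied by the S.D. of its variable
weighted = [a * s for a, s in zip(loadings_I, std_devs)]

for skill, w in enumerate(weighted, start=1):
    print(f"skill {skill}: {w:.1f}")
```

On this scale skill 1 dominates component I, and skill 7 carries slightly more weight than skill 6, as the footnote states.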
Component III is not so readily interpretable as most of the oth- 
ers, but it is clear that individuals who obtain high scores in this com- 
ponent focus their attention on a writer’s explicit statements almost 
to the exclusion of their implications. Component IV measures chiefly 
the ability to identify a writer’s intent, purpose, or point of view 
(skill 9). Individuals who obtain high scores in this component are 
less concerned with what a writer says than with why he says it. 
Such individuals should presumably be better able to detect bias and 
propaganda than individuals who obtain low scores in this component. 
Component V is composed principally of ability to figure out from the 
context the meaning of an unfamiliar word or to determine which one 
of several known meanings of a word is most appropriate in its par- 
ticular contextual setting (skill 2). It is reasonable that it should be 
essentially unrelated to skill 1, which measures memory for isolated 
word meanings. The slight negative loadings of skills 6 and 7 in com- 
ponent V may result from the fact that the latter measures deductive 
reasoning, while skills 6 and 7 measure inductive processes. 

Judging by its very high loading in skill 5, component VI seems 
to be largely a measure of ability to grasp the detailed statements in 
a passage. It is probably a fairly direct measure of the ability to get 
what I. A. Richards has called “the literal sense meaning” of a pas- 
sage. Skill 5 was originally intended to measure this ability and the 
results of the analysis suggest that this ability is more than a name; 
it appears to be a real psychological entity distinct from other men- 
tal skills involved in reading. Component VII seems to be a measure 
principally of skill 3 (ability to follow the organization of a passage 
and to identify antecedents and references in it). The variance of 
this component consists of about 77% of the original variance of 
skill 3. 

Component VIII measures specific knowledge of literary devices 
and techniques, and probably reflects the influence of training in Eng- 
lish more than the other components do. Component IX is composed 
largely of ability to select the main thought of a passage; it may be 
considered a measure of ability in the synthesis of meaning. The vari- 
ance of component IX comprises approximately 82% of the original 
variance of skill 4 (ability to select the main thought of a passage). 
Students who make high scores in component IX are presumably 
those who would be most capable of writing adequate summaries and 
précis of what they read. 

Of the nine components described, all except components II, III, 
and IV can, for practical purposes, probably be measured satisfac- 
torily by means of raw scores in one of the nine basic reading skills 
selected initially. Components V through IX account for only a small 
fraction of the total variance, but their variances are significantly 
different.* A number of the skills considered most important by au- 
thorities in the field of reading include independent elements that 
should be taught separately. It is not enough to assign learning ex- 
ercises in reading that consist of passages followed by factual ques- 
tions to be answered. Such exercises will not necessarily call the stu- 
dent’s attention to the separate and essentially unrelated reading 
skills that he ought to master or give him sufficient practice in each 
one of them. 











TABLE 6 
Variance Ratios of Successive Components 

Component   Degrees of Freedom   Variance      F 
    I              406            192.270 
                                             8.280 
   II              399             22.824 
                                             2.663 
   III             403              8.657 
                                             1.622 
   IV              399              5.282 
                                             1.387 
    V              401              3.828 
                                             1.158 
   VI              401              3.306 
                                             1.428 
   VII             403              2.327 
                                             1.181 
   VIII            400              1.956 
                                             1.944 
   IX              400              1.006 








* The writer is indebted to Professor T. L. Kelley for the development of a 
precise test of the variance ratios of components obtained by his iterative process. 
This test is described in the article by Professor Kelley that immediately follows. 

The differences between the variances of successive components are all sig- 
nificant at the one-per-cent level with the exception of the differences between the 
variances of components V and VI, and VII and VIII; those differences are sig- 
nificant approximately at the five-per-cent level. 

It should be noted that the variance-ratio test of the significance of the dif- 
ference between component variances is permitted by the Kelley method but is 
not permitted by other methods of factorial analysis that are frequently employed. 

Whether the variance of component IX (the smallest component) is signi- 
ficantly greater than would be yielded by chance may be determined by noting 
whether the reliability coefficient of component IX is significantly greater than 
zero. This is not established by the data. It is highly likely, however, that the 
variance of the next largest component is significantly greater than would be 
yielded by chance. 
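
The F values of Table 6 can be recomputed from the tabled variances and degrees of freedom, each variance being divided by its own degrees of freedom before the ratio is formed, as in Professor Kelley's article. The sketch below is an arithmetic check only; it takes 399 degrees of freedom for component II (the value that reproduces the tabled ratios) and does not look up significance levels. The names used are illustrative.

```python
# (component, degrees of freedom, variance) as read from Table 6
components = [
    ("I", 406, 192.270), ("II", 399, 22.824), ("III", 403, 8.657),
    ("IV", 399, 5.282), ("V", 401, 3.828), ("VI", 401, 3.306),
    ("VII", 403, 2.327), ("VIII", 400, 1.956), ("IX", 400, 1.006),
]


def variance_ratio(v_a, df_a, v_b, df_b):
    """F for two component variances, each divided by its degrees of freedom."""
    return (v_a / df_a) / (v_b / df_b)


ratios = [
    variance_ratio(v_a, df_a, v_b, df_b)
    for (_, df_a, v_a), (_, df_b, v_b) in zip(components, components[1:])
]
for (name_a, _, _), (name_b, _, _), f in zip(components, components[1:], ratios):
    print(f"{name_a} / {name_b}: F = {f:.3f}")
```

The recomputed ratios agree with the tabled ones to within about one unit in the third decimal, which is the order of difference to be expected from rounding of the published variances.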
TABLE 7 

Reliability Coefficients, Means, and Standard Deviations of the Six Independent 
Components Having Reliability Coefficients Substantially Greater Than Zero 

Component    r11     Mean    Standard Deviation 
    I        .94    46.30         13.87 
   II        .48    24.14          4.78 
   III       .28      .81          2.94 
   IV        .17     -.62          2.30 
   VII       .33      .27          1.53 
   VIII      .29      .70          1.40 







Because individual scores in each of the independent components 
defined above can readily be estimated by using appropriate regres- 
sion equations (Cf. ante, footnote following Table 5), the reliability 
coefficients of scores in the nine components have been determined 
empirically, using the same sample of 100 cases for which odd and 
even scores in each variable were obtained in computing the reliabil- 
ity coefficients of the nine initial variables. 

Inspection of Table 7 reveals that only components I and II are 
measured with sufficient reliability to warrant their use for practical 
purposes. However, when the significance of the reliability coefficient 
of each one of the nine components is tested,* it becomes evident that 
useful measures of at least three additional components could cer- 
tainly be provided by constructing the required number of additional 
items of the appropriate types. Since several of the components may 
be satisfactorily measured, for practical purposes, by raw scores in 
appropriate types of test items, construction of a large number of the 
indicated types of items has already been started. It is believed that 
these may be useful for instructional as well as for measurement pur- 
poses when they are employed in combination with other workbook 
materials. 
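
The significance test referred to above rests on Shen's formula for the standard error of a split-half reliability coefficient corrected by the Spearman-Brown formula. The sketch below assumes the form 2(1 - r11)/sqrt(N), which follows from the usual large-sample standard error of the half-test correlation, and applies it to the reliabilities of Table 7 with N = 100. The function name is illustrative, and no particular significance criterion is implied.

```python
import math


def shen_standard_error(r11, n):
    """Assumed form of Shen's formula: 2(1 - r11) / sqrt(N)."""
    return 2.0 * (1.0 - r11) / math.sqrt(n)


# Reliability coefficients of Table 7, each based on 100 cases
reliabilities = {"I": 0.94, "II": 0.48, "III": 0.28,
                 "IV": 0.17, "VII": 0.33, "VIII": 0.29}

for component, r11 in reliabilities.items():
    se = shen_standard_error(r11, 100)
    print(f"{component}: r11 = {r11:.2f}, S.E. = {se:.3f}, ratio = {r11 / se:.1f}")
```

The ratio of each coefficient to its standard error gives a rough indication of how far the coefficient stands from zero.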

Since useful measures of components I and II are already avail- 
able, a profile chart for making a graphic record of scores in these 
two components has been prepared and is described in considerable 
detail elsewhere (9). 

The correlations of components I and II with the Q and L scores 
derived from the American Council on Education Psychological Ex- 


* The standard error of a split-half reliability coefficient, corrected by the 
Spearman-Brown formula, may be obtained by using Shen’s formula, 

$$\sigma_{r_{11}} = \frac{2(1 - r_{11})}{\sqrt{N}}.$$ 
amination and with the total score on the Nelson-Denny Reading 
Test have also been reported in the literature (9, 370-371). It is 
hoped that the relationships between components I and II and other 
well-known reading tests can be obtained, for if components I and 
II are regarded as fundamental abilities in reading it is of para- 
mount importance to determine the extent to which the reading tests 
now commonly used in high schools and colleges actually measure 
each of these abilities. 

The study reported here has explored one means of investigating 
the psychological nature of reading ability. It has suggested a means 
of determining the validity of tests of comprehension in reading. The 
results indicate that there is need for reliable tests to measure several 
of the nine basic skills that have been defined and for workbooks to 
aid in improving students’ abilities in them. The need for correlating 
scores in existing reading tests with scores in several of the principal 
components seems obvious. And, not least, the study provides more 
detailed information regarding the skills measured by the Coopera- 
tive Reading Comprehension Tests than has heretofore been provided 
regarding the skills actually measured by any other widely used read- 
ing test.* 

Finally, it is hoped that the data presented will draw attention to 
the importance of the mental skills involved in reading and act as a 
stimulus to further research in the fundamental factors of compre- 
hension. 

A VARIANCE-RATIO TEST OF THE UNIQUENESS OF PRINCI- 
PAL-AXIS COMPONENTS AS THEY EXIST AT ANY 
STAGE OF THE KELLEY ITERATIVE PROCESS 
FOR THEIR DETERMINATION 


TRUMAN L. KELLEY 
HARVARD UNIVERSITY 


The immediately preceding article by Dr. Frederick B. Davis 
provides illustration of the method here given. Let the initial vari- 
ables be: 

$$x_1, x_2, \ldots, x_m,$$ 
with the degrees of freedom in each: $N-1, N-1, \ldots, N-1$. 


Make a first rotation between $x_1$ and $x_2$, obtaining $y_1$ and $y_2$. The 
equation $y_1 = x_1 \cos\theta + x_2 \sin\theta$ constitutes a linear restriction, and, 
in addition to this, $\sum y_1 = 0$. Similarly for $y_2$, so we now have vari- 
ables $y_1, y_2, x_3, \ldots, x_m$, having the following degrees of freedom: 
$N-2, N-2, N-1, \ldots, N-1$. Also, at this point $y_1$ and $y_2$ are inde- 
pendent, so a variance-ratio test is appropriate. The variance ratio is 

$$F_{N-2,\,N-2} = \frac{V_{y_1}/(N-2)}{V_{y_2}/(N-2)}.$$ 

If the next rotation is between $y_1$ and $x_3$, we would have 

variables: $z_1, y_2, z_3, x_4, \ldots, x_m$, 
d.o.f.: $N-3, N-2, N-2, N-1, \ldots, N-1$. 

The variance ratio 

$$F_{N-3,\,N-2} = \frac{V_{z_1}/(N-3)}{V_{y_2}/(N-2)}$$ 

provides a precise test for the difference between these two variances. 
This last rotation may have introduced a very slight covariance, $C_{z_1 y_2}$, 
but hardly such as to vitiate a variance-ratio test between $V_{z_1}$ and $V_{y_2}$, 
for the correlation between $z_1$ and $y_2$ is very low, and between the 
variances still lower, the correlation between variances being closely 
equal to the square of the correlation between variables. 

When the iterative process is continued until all covariance in 
excess of the order of covariance yielded by chance, considering the 
size of the sample, is eliminated, the resulting variables are, within 
the limits of chance, the final principal-axes components, $C_1, C_2, \ldots, 
C_m$, and a variance-ratio test between the variance of any component 
and any other component is available. It is 

$$F_{N-1-g,\,N-1-h} = \frac{V_{C_a}/(N-1-g)}{V_{C_b}/(N-1-h)},$$ 

in which g is the number of rotations involved in reaching the vari- 
able that is Component a, and h the number connected with the vari- 
able that is Component b. 
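
The successive pairwise rotations described above may be illustrated by the classical Jacobi scheme, in which each rotation y_p = x_p cos θ + x_q sin θ, y_q = -x_p sin θ + x_q cos θ is chosen so as to remove the covariance between one pair of variables. The sketch below is offered under that assumption and is not a reproduction of Kelley's procedure: his order of rotations and his stopping rule (covariance of the order yielded by chance) are not specified here, so the sketch simply sweeps all pairs until the covariances vanish numerically. The matrix is that of Table 4 of the preceding article by Davis; all function names are illustrative.

```python
import math

# Upper triangle of the variance-covariance matrix (Table 4 of Davis), skills 1-9
UPPER = [
    [134.70, 27.01, 8.16, 3.65, 14.77, 46.88, 45.78, 11.04, 32.07],
    [10.56, 1.94, 1.29, 4.22, 13.08, 12.90, 3.17, 8.93],
    [3.01, 0.31, 1.44, 4.24, 4.24, 0.90, 2.91],
    [1.22, 0.82, 2.25, 2.25, 0.59, 1.63],
    [6.05, 8.93, 7.85, 2.07, 5.53],
    [32.17, 24.89, 5.96, 17.42],
    [33.75, 6.33, 16.00],
    [3.46, 4.42],
    [16.54],
]


def full_matrix(upper):
    """Expand an upper-triangular listing into a full symmetric matrix."""
    m = len(upper)
    a = [[0.0] * m for _ in range(m)]
    for i, row in enumerate(upper):
        for offset, value in enumerate(row):
            a[i][i + offset] = a[i + offset][i] = value
    return a


def jacobi_variances(matrix, tol=1e-12, max_sweeps=50):
    """Return component variances (descending) and the number of rotations used."""
    a = [row[:] for row in matrix]
    m = len(a)
    rotations = 0
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(m) for j in range(i + 1, m))
        if off < tol:
            break
        for p in range(m - 1):
            for q in range(p + 1, m):
                if a[p][q] == 0.0:
                    continue
                diff = a[p][p] - a[q][q]
                # Angle that makes the covariance of the rotated pair zero
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(m):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp + s * akq, -s * akp + c * akq
                for k in range(m):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk + s * aqk, -s * apk + c * aqk
                rotations += 1
    return sorted((a[i][i] for i in range(m)), reverse=True), rotations


variances, n_rotations = jacobi_variances(full_matrix(UPPER))
print([round(v, 3) for v in variances], n_rotations)
```

Since every rotation is orthogonal, the sum of the component variances equals the sum of the original variances, and the largest value obtained may be compared with the variance of component I in Table 5 of the preceding article.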

CONTRIBUTIONS TO THE MATHEMATICAL THEORY OF 
HUMAN RELATIONS VIII: SIZE DISTRIBUTION 
OF CITIES 


N. RASHEVSKY 
THE UNIVERSITY OF CHICAGO 


An attempt is made to connect the distribution function of the 
sizes of the cities with the distribution functions of some other char- 
acteristics of the individuals in the society. Several theoretical pos- 
sibilities are discussed and different relations are derived. A possible 
connection with some observed relations is discussed. 


In a previous paper (13) we have outlined a theory of size dis- 
tribution of cities, which represented a generalization of our previ- 
ous theory of urbanization (4, 10). If N(n)dn denotes the total num- 
ber of individuals that inhabit all cities, the population of which is 
between n and n + dn, then N(n) is determined essentially by the 
function f(n, N), which represents the per capita production of goods 
in the cities of population n, N being the total population. The fac- 
tors considered in that theory are essentially economic ones. 

We shall now consider a different approach to the same prob- 
lem. Before we do that, however, we shall briefly discuss the possible 
relation of the previous theory to available data. 

G. K. Zipf (14) has found an interesting relation between the
sizes of cities and their rank according to size. If we rank all cities
consecutively in order of decreasing size and denote the population of
a city of rank $r$ by $n(r)$, then, according to Zipf, for many countries
we have the relation

$$n(r) = \frac{C}{r}, \tag{1}$$

where $C$ is a constant. By considering data for the United States and
for Canada at different times, Zipf finds that relation (1) did not
hold in the past, but has been gradually approached. He generalizes
this by postulating that relation (1) is characteristic of a stable society,
and is reached as a society passes from less stable to more stable
configurations. In our opinion, such a postulate lacks both empirical
and theoretical support. It would therefore be of great interest if such
a postulate could be obtained as a deduction from a rational theory.
This would throw light on the mechanism underlying the simple relation (1).

Inasmuch as the distribution functions studied by us previously 
do not involve the rank-order of the cities, we shall first investigate 
how Zipf’s results translate themselves into our notations. 

Denoting by $r(n)$ the rank of a city having a population $n$,
equation (1) may also be written

$$r(n) = \frac{C}{n}. \tag{2}$$


Let $\tilde{N}(n)\,\delta n$ be the number of cities whose population lies
between $n$ and $n + \delta n$, $\delta n$ being a small but finite quantity.
Consider instead of (2) any arbitrary function $r(n)$ which must, however, be
decreasing with increasing $n$. For very large values of $r(n)$, the
number of cities $\tilde{N}(n)\,\delta n$ will be large even for small $\delta n$.
Since $r(n)$ increases discontinuously and always in steps of one,
$\tilde{N}(n)\,\delta n$ equals $-\delta r(n)$, the change of rank which
corresponds to $\delta n$, for a given $n$:

$$\tilde{N}(n)\,\delta n = -\delta r(n), \tag{3}$$

or

$$\tilde{N}(n) = -\frac{\delta r(n)}{\delta n}. \tag{4}$$

For very large $r$'s, we can take very small $\delta n$, and in the limit
shall then find

$$\tilde{N}(n) = -\frac{dr}{dn}. \tag{5}$$
The notion of a continuous distribution function $\tilde{N}(n)$ breaks down
for very large values of $n$, for there are only a few very large cities
in each country. Nevertheless, far away from the tail end of the
curve, the function $\tilde{N}(n)$ may be practically determined. The
rank-order notation has the advantage of covering the whole range of sizes.
The function $\tilde{N}(n)$ is connected with the function $N(n)$ by the
relation

$$\tilde{N}(n) = \frac{N(n)}{n}. \tag{6}$$

Hence, if $r(n) = C/n$, then

$$\tilde{N}(n) = \frac{C}{n^2}; \qquad N(n) = \frac{C}{n}. \tag{7}$$
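
The passage from the rank-order relation to the density of cities can be checked by direct counting. The following is a minimal sketch, assuming an idealized set of cities that follow relation (1) exactly; the constant `C`, the number of ranks, and the size brackets are hypothetical values chosen for illustration, not data.

```python
def zipf_sizes(C, max_rank):
    """Populations n(r) = C / r for ranks 1..max_rank, as in relation (1)."""
    return [C / r for r in range(1, max_rank + 1)]


def count_cities(sizes, lo, hi):
    """Number of cities whose population n satisfies lo <= n < hi."""
    return sum(1 for n in sizes if lo <= n < hi)


def predicted_count(C, lo, hi):
    """Integral of the city density C / n**2 from lo to hi, following (7)."""
    return C * (1.0 / lo - 1.0 / hi)


if __name__ == "__main__":
    C = 1.0e6  # hypothetical constant, not taken from any census
    sizes = zipf_sizes(C, 100_000)
    for lo, hi in [(100, 200), (1_000, 2_000), (10_000, 20_000)]:
        # The direct count and the integral of C / n**2 should nearly agree.
        print(lo, hi, count_cities(sizes, lo, hi), predicted_count(C, lo, hi))
```

The agreement is close wherever many cities fall in a bracket and becomes poor for the few largest cities, in line with the remark that the continuous density breaks down at the tail.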
Denote by $N$ the total population,

$$N = \int_0^\infty n\,\tilde{N}(n)\,dn = \int_0^\infty N(n)\,dn. \tag{8}$$

A distribution function $N(n)$ such as given by equation (7)
could be readily obtained from our previous theory by putting in (13)

$$f(n, N) = AnN, \tag{9}$$

where $A$ is a constant. That would give us, according to equation
(17) of (13),

$$N(n) = \frac{p}{n}, \tag{10}$$

$p$ being determined as before from relation (8) of this paper.

Equation (9) means that the per capita production of goods is
proportional to both $n$ and $N$, a relation that does not seem to be
very plausible. Because of relation (6) this would imply that the
total production is proportional to $nN^2$.

The simple theory developed before does not seem, therefore, to 
account for the relation found by Zipf. Inasmuch as not all countries 
follow Zipf’s relation, and inasmuch as the above-mentioned theory 
may be modified and generalized, we should not discard it yet alto- 
gether. In a complex problem like this in which many factors prob- 
ably enter, it may be advisable to discuss in abstracto different con- 
ceivable theoretical cases, without first worrying about actual data. 
A thorough classification of the theoretical possibilities may later on 
prove a help in deciding which of those possibilities or their combi- 
nations can actually be applied to observations. It is in this spirit of 
an abstract theoretical study that we shall discuss an alternative ap- 
proach to the problem without prejudice to other possible approaches. 

It is natural to attempt to connect the formation of cities with
the presence of active groups in a society. A city may originate as
an administrative center, in which case its formation will be closely
related with the activity of an administrative active group, which we
shall denote group I, as in previous papers (3-10). A city may also
originate as a trade or industrial center, in which case it will be
associated with another active group, which we previously denoted as
group II. The stronger the administrative group I, the larger we
may expect the principal city to be. On the contrary, a strong industrial
group will result in the formation of large industrial cities. It
is therefore natural to inquire whether the distribution function of
city sizes may not be connected with the distribution function of the
gradation of different types of activities within the population. In
our earlier papers we began by considering continuous distribution
functions for such quantities as coefficients of influence, etc. (1, 2).
Subsequently, for mathematical convenience and as a practical
approximation, we considered discontinuous distributions, assuming the
population to be divided into completely active and completely passive
individuals without gradual transition. While such an assumption
certainly does not correspond to reality, it serves as a good approximation.
In some cases the results obtained by considering continuous
distribution are found to be essentially the same as for the approximate
discontinuous case (9).

Since this is a purely theoretical study, we shall not specify here
what particular characteristics we do consider. We shall simply denote
that characteristic by $x$ and consider that in a population of $N$
individuals the characteristic $x$ is distributed according to some
function $N(x)$. In whatever physical or psychophysical units we measure
that characteristic $x$, we shall choose our units so that the maximum
value of $x$ in a given group is equal to 1. We then have

$$N = \int_0^1 N(x)\,dx. \tag{11}$$


We have seen previously that such a group may break up into
several smaller groups, if each individual associates only with those
individuals whose $x$ is not too remote from his own. Equations for
determination of the size of those classes have been given previously
(1, 2). We thus find that the whole population is divided into $n + 1$
groups, whose $x$'s lie in the intervals $1 - x_1,\ x_1 - x_2,\ x_2 - x_3,\ \ldots,\ x_n - 0$,
with $x_i > x_{i+1}$ and $x_0 = 1$. The total amount of $x$ in a group $r$ is given by

$$X(r) = \int_{x_r}^{x_{r-1}} x N(x)\,dx. \tag{12}$$

The corresponding “populations” are given by

$$N(r) = \int_{x_r}^{x_{r-1}} N(x)\,dx. \tag{13}$$


Suppose now that these groups will be segregated spatially. They 
will thus form separate communities of different sizes. 

Let $x$ denote any special ability such as executive, business, literary,
etc., and consider the case where $N(x)$ decreases with increasing
$x$. The functions $N(r)$ and $X(r)$ may either decrease or increase
with $r$. It is clear that we cannot set $n(r)$ proportional to $N(r)$ if
the latter increases, for the largest community will be composed of
individuals with the smallest $x$. If, however, $N(r)$ and $X(r)$ decrease
with increasing $r$, we may consider the following situation.
The $r$-th group of $N(r)$ individuals may form a nucleus around which
a number, $n'(r)$, of individuals of the lowest $x$ gather to perform
any activity directed by the $N(r)$ individuals, and thus form a community
of $n(r) = N(r) + n'(r)$ individuals. We thus assume that
the class of lowest $x$'s, namely, that lying in the interval $x_n - 0$, is
entirely passive. This class is also the most numerous one. The simplest
assumption we may make about $n'(r)$ is that it is proportional
to $X(r)$. We then have

$$n(r) = N(r) + aX(r). \tag{14}$$

If $N(r) \ll aX(r)$, then we have approximately

$$n(r) = aX(r). \tag{15}$$


Thus $n(r)$ would be determined by $N(x)$, and we may investigate
what form of $N(x)$ will give a prescribed $n(r)$, for example, the
expression (1).

While such a case presents some theoretical interest, yet it can 
hardly be applied to actual cases. For it implies that the active group 
of each community has a different range of x’s, the ranges for differ- 
ent cities never overlapping. 

We may consider a somewhat more realistic assumption which
is free from the above shortcoming. Let the group $N(1)$ form a nucleus
around which there will be gathered $n'(1) = aX(1)$ individuals,
but let those $n'(1)$ individuals be taken from all the $\bar{N}^{(1)} = N - N(1)$
individuals that are left outside of the group $N(1)$. Furthermore, let
the contribution of individuals with a given $x$ to $aX(1)$ be proportional
to the frequency with which those individuals occur. We have

$$\bar{N}^{(1)} = \int_0^{x_1} N(x)\,dx. \tag{16}$$

The distribution function $N^{(2)}(x)$ of the individuals left after the
$aX(1)$ individuals have been subtracted from $\bar{N}^{(1)}$ is, with the above
assumption, equal to

$$N^{(2)}(x) = \left(1 - \frac{aX(1)}{\bar{N}^{(1)}}\right) N(x). \tag{17}$$

In other words, each class has lost a fraction $aX(1)/\bar{N}^{(1)}$ of individuals.
Now the group whose $x$ lies between $x_1$ and $x_2$ has only

$$N^{(2)} = \int_{x_2}^{x_1} N^{(2)}(x)\,dx \tag{18}$$

individuals, and their total $x$ is equal to

$$X'(2) = \int_{x_2}^{x_1} x N^{(2)}(x)\,dx. \tag{19}$$

This group will gather around it $aX'(2)$ individuals, from the
$\bar{N}^{(2)} = \bar{N}^{(1)} - aX(1) - N^{(2)}$ individuals left outside of the group. We have

$$\bar{N}^{(2)} = \int_0^{x_2} N^{(2)}(x)\,dx. \tag{20}$$

The distribution function of the remaining individuals is given by

$$N^{(3)}(x) = \left(1 - \frac{aX'(2)}{\bar{N}^{(2)}}\right) N^{(2)}(x), \tag{21}$$

and

$$X'(3) = \int_{x_3}^{x_2} x N^{(3)}(x)\,dx. \tag{22}$$

Thus we can consecutively calculate $\bar{N}^{(1)}, N^{(2)}, \bar{N}^{(2)}, N^{(3)}, \ldots$, as well as
$X(1), X'(2), X'(3), \ldots$, and thus find $n(r)$ from

$$n(r) = N^{(r)} + aX'(r). \tag{23}$$


This scheme may be generalized further by considering that as
the distribution functions $N^{(r)}(x)$ change step by step, so will the
set $x_1, x_2, \ldots, x_n$ change, according to equations developed previously
(1). We shall have the interval $1 - x_1$ determined by $N^{(1)}(x) = N(x)$.
Instead of $x_1 - x_2$, we shall use in (18) an interval $x_1 - x'_2$, where
$x'_2$ is determined by $N^{(2)}(x)$, and so forth.

The difficulty with actual calculation of such expressions lies in
the circumstance that even the simplest forms of $N(x)$ lead to
transcendental equations for $x_1, x_2, \ldots, x_n$, which do not admit of closed
solutions (1, 2). In order to get an idea as to how such expressions
as (12), (13), (14), and (23) behave, we shall make a very crude
approximation and consider all intervals $x_i - x_{i+1}$ as equal to a small
constant $\Delta$:

$$x_i - x_{i+1} = \Delta. \tag{24}$$


For very small values of $\Delta$ an approximate expression for $X(r)$
and $N(r)$ is easily obtained. We may put approximately

$$X(r) = \int_{1-r\Delta}^{1-(r-1)\Delta} x N(x)\,dx = \Delta\,x N(x), \tag{25}$$

and

$$N(r) = \int_{1-r\Delta}^{1-(r-1)\Delta} N(x)\,dx = \Delta\,N(x). \tag{26}$$

Remembering that for the $r$-th group $x$ is approximately equal to
$1 - r\Delta$, we find

$$X(r) = \Delta(1 - r\Delta)\,N(1 - r\Delta); \qquad N(r) = \Delta\,N(1 - r\Delta). \tag{27}$$


We shall consider here, as a theoretically interesting case, an
approximate distribution function

$$N(x) = A x^{-\nu}, \qquad \nu > 0, \tag{28}$$

which is suggested by Pareto's law. The relation (28) cannot hold
physically for $x = 0$. Most likely $N(0) = 0$, but it may also be that
$N(0) = \text{Const}$. Except for exceedingly small values of $x$, the
approximation may be very good. The exact expression for $N(x)$ should
satisfy relation (11), which also determines the constant $A$.

For very small values of $\Delta$ we have from (27)

$$X(r) = A\Delta(1 - r\Delta)^{1-\nu}; \qquad N(r) = A\Delta(1 - r\Delta)^{-\nu}. \tag{29}$$

If $X(r)$ is always to decrease with increasing $r$, we must have

$$0 < \nu < 1. \tag{30}$$

In the following we shall always consider that the restriction (30) is
satisfied.

The exact expressions for $X(r)$ and $N(r)$ are obtained from
(12) and (13):

$$X(r) = \frac{A}{2-\nu}\left\{[1-(r-1)\Delta]^{2-\nu} - [1-r\Delta]^{2-\nu}\right\};$$
$$N(r) = \frac{A}{1-\nu}\left\{[1-(r-1)\Delta]^{1-\nu} - [1-r\Delta]^{1-\nu}\right\}. \tag{31}$$

These expressions are to be used in either (14) or (15).
Since, however, $N(r)$ in this case increases with $r$, equation
(14) would have no meaning.
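
The behavior of these expressions is easily tabulated. The following is a minimal sketch, assuming the power-law form (28) with equal intervals (24); the function names are ours, and any parameter values used with them are hypothetical illustrations, subject to restriction (30).

```python
def X_exact(r, A, nu, delta):
    """Total x in class r, eq. (31), for N(x) = A * x**(-nu)."""
    hi, lo = 1 - (r - 1) * delta, 1 - r * delta
    return A / (2 - nu) * (hi ** (2 - nu) - lo ** (2 - nu))


def N_exact(r, A, nu, delta):
    """Number of individuals in class r, eq. (31)."""
    hi, lo = 1 - (r - 1) * delta, 1 - r * delta
    return A / (1 - nu) * (hi ** (1 - nu) - lo ** (1 - nu))


def X_small_delta(r, A, nu, delta):
    """Small-delta form of X(r), eq. (29)."""
    return A * delta * (1 - r * delta) ** (1 - nu)


def N_small_delta(r, A, nu, delta):
    """Small-delta form of N(r), eq. (29)."""
    return A * delta * (1 - r * delta) ** (-nu)
```

With, say, `A = 1.0`, `nu = 0.5`, `delta = 0.01`, the values of `X_exact` fall with increasing `r` while those of `N_exact` rise, which is the circumstance that deprives equation (14) of meaning in this case.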
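We now shall derive an explicit form for expression (23) based
on (24) and (28), and show that the difficulty vanishes in this case.
We have

$$N^{(1)}(x) = N(x) = A x^{-\nu}, \tag{32}$$

$$\bar{N}^{(1)} = A\int_0^{1-\Delta} x^{-\nu}\,dx = \frac{A}{1-\nu}(1-\Delta)^{1-\nu}, \tag{33}$$

$$X'(1) = X(1) = A\int_{1-\Delta}^{1} x^{1-\nu}\,dx = \frac{A}{2-\nu}\left\{1 - (1-\Delta)^{2-\nu}\right\}. \tag{34}$$

Hence

$$N^{(2)}(x) = \left(1 - \frac{aX(1)}{\bar{N}^{(1)}}\right) N(x) = \left(1 - a\,\frac{[1-(1-\Delta)^{2-\nu}](1-\nu)}{(2-\nu)(1-\Delta)^{1-\nu}}\right) A x^{-\nu}; \tag{35}$$

$$X'(2) = \int_{1-2\Delta}^{1-\Delta} x N^{(2)}(x)\,dx = \left(1 - a\,\frac{1-\nu}{2-\nu}\,\frac{1-(1-\Delta)^{2-\nu}}{(1-\Delta)^{1-\nu}}\right)\frac{A}{2-\nu}\left[(1-\Delta)^{2-\nu} - (1-2\Delta)^{2-\nu}\right]. \tag{36}$$

Define

$$\alpha = a\,\frac{1-\nu}{2-\nu}, \tag{37}$$

and

$$f(r) = \left(1 - \alpha\,\frac{1-(1-\Delta)^{2-\nu}}{(1-\Delta)^{1-\nu}}\right)\left(1 - \alpha\,\frac{(1-\Delta)^{2-\nu} - (1-2\Delta)^{2-\nu}}{(1-2\Delta)^{1-\nu}}\right)\cdots\left(1 - \alpha\,\frac{[1-(r-1)\Delta]^{2-\nu} - [1-r\Delta]^{2-\nu}}{(1-r\Delta)^{1-\nu}}\right). \tag{38}$$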
We shall now prove that in general

$$N^{(r)}(x) = A f(r-1)\,x^{-\nu}; \tag{39}$$

$$X'(r) = \frac{A}{2-\nu}\,f(r-1)\left\{[1-(r-1)\Delta]^{2-\nu} - [1-r\Delta]^{2-\nu}\right\}. \tag{40}$$

Expressions (39) and (40) hold for $r = 1$ and $r = 2$, as we have
seen, with $f(0) = 1$. We shall prove that if they hold for $r$, they hold for $r + 1$.
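From (39) we have

$$\bar{N}^{(r)} = \int_0^{1-r\Delta} N^{(r)}(x)\,dx = \frac{A}{1-\nu}\,f(r-1)(1-r\Delta)^{1-\nu}, \tag{41}$$

$$N^{(r+1)}(x) = \left(1 - \frac{aX'(r)}{\bar{N}^{(r)}}\right) N^{(r)}(x) = \left(1 - a\,\frac{1-\nu}{2-\nu}\,\frac{[1-(r-1)\Delta]^{2-\nu} - [1-r\Delta]^{2-\nu}}{(1-r\Delta)^{1-\nu}}\right) f(r-1)\,A x^{-\nu}. \tag{42}$$

Because of (37) and (38), equation (42) may be written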
$$N^{(r+1)}(x) = A f(r)\,x^{-\nu}. \tag{43}$$

We have further,

$$X'(r+1) = \int_{1-(r+1)\Delta}^{1-r\Delta} x N^{(r+1)}(x)\,dx,$$

which, because of (43), may be written

$$X'(r+1) = \frac{A}{2-\nu}\,f(r)\left\{[1-r\Delta]^{2-\nu} - [1-(r+1)\Delta]^{2-\nu}\right\}. \tag{44}$$

This proves (39) and (40).
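We have

$$N^{(r)} = \int_{1-r\Delta}^{1-(r-1)\Delta} N^{(r)}(x)\,dx = \frac{A}{1-\nu}\,f(r-1)\left\{[1-(r-1)\Delta]^{1-\nu} - [1-r\Delta]^{1-\nu}\right\}. \tag{45}$$

Hence, introducing (40) and (45) into (23):

$$n(r) = \frac{A}{1-\nu}\,f(r-1)\left\{[1-(r-1)\Delta]^{1-\nu} - [1-r\Delta]^{1-\nu}\right\} + \frac{aA}{2-\nu}\,f(r-1)\left\{[1-(r-1)\Delta]^{2-\nu} - [1-r\Delta]^{2-\nu}\right\}. \tag{46}$$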
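
The induction can be checked numerically by carrying out the successive subtractions step by step and comparing the result with the closed form. The following is a minimal sketch, assuming the power-law distribution (28) and equal intervals (24); the function names and the variable `pool` (the number of individuals left below the $r$-th interval, as in (41)) are our own labels.

```python
def depletion_sizes(A, nu, a, delta, r_max):
    """Community sizes n(1..r_max), following the scheme of eqs. (16)-(23)."""
    scale = 1.0  # N^(r)(x) = scale * A * x**(-nu); scale plays the role of f(r-1)
    sizes = []
    for r in range(1, r_max + 1):
        hi, lo = 1 - (r - 1) * delta, 1 - r * delta
        nucleus = scale * A / (1 - nu) * (hi ** (1 - nu) - lo ** (1 - nu))
        x_total = scale * A / (2 - nu) * (hi ** (2 - nu) - lo ** (2 - nu))
        pool = scale * A / (1 - nu) * lo ** (1 - nu)  # individuals with x below lo
        sizes.append(nucleus + a * x_total)           # eq. (23)
        scale *= 1 - a * x_total / pool               # each class loses this fraction
    return sizes


def f_closed(r, nu, a, delta):
    """The product f(r) of eq. (38), with f(0) = 1."""
    alpha = a * (1 - nu) / (2 - nu)
    prod = 1.0
    for k in range(1, r + 1):
        hi, lo = 1 - (k - 1) * delta, 1 - k * delta
        prod *= 1 - alpha * (hi ** (2 - nu) - lo ** (2 - nu)) / lo ** (1 - nu)
    return prod


def n_closed(r, A, nu, a, delta):
    """Closed form of n(r), eq. (46)."""
    hi, lo = 1 - (r - 1) * delta, 1 - r * delta
    f = f_closed(r - 1, nu, a, delta)
    return (A * f / (1 - nu) * (hi ** (1 - nu) - lo ** (1 - nu))
            + a * A * f / (2 - nu) * (hi ** (2 - nu) - lo ** (2 - nu)))
```

For hypothetical values such as `A = 1.0`, `nu = 0.5`, `a = 20.0`, `delta = 0.01`, the two computations agree to rounding error for every rank, which is what (39) and (40) assert.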




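The expression $f(r)$ simplifies considerably for very small values
of $\Delta$. Putting for any $k$,

$$k\Delta = y, \tag{47}$$

we have

$$[1-(k-1)\Delta]^{2-\nu} - [1-k\Delta]^{2-\nu} = [1-(y-\Delta)]^{2-\nu} - [1-y]^{2-\nu}. \tag{48}$$

For very small values of $\Delta$, this is equal to

$$-\Delta\,\frac{d}{dy}(1-y)^{2-\nu} = (2-\nu)\Delta(1-y)^{1-\nu} = (2-\nu)\Delta(1-k\Delta)^{1-\nu}. \tag{49}$$

Hence, because of (48), (49), and (37),

$$1 - \alpha\,\frac{[1-(k-1)\Delta]^{2-\nu} - [1-k\Delta]^{2-\nu}}{(1-k\Delta)^{1-\nu}} = 1 - a(1-\nu)\Delta; \tag{50}$$

and, because of (38):

$$f(r) = [1 - a(1-\nu)\Delta]^{r}. \tag{51}$$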
Introducing (51) into (40), and transforming the expression in
braces of (40) according to (48) and (49), we have:

$$X'(r) = A\Delta(1-r\Delta)^{1-\nu}\,[1 - a(1-\nu)\Delta]^{r-1}. \tag{52}$$

Since physically we must have

$$0 < 1 - a(1-\nu)\Delta < 1, \tag{53}$$

therefore, comparing (52) with (29), we see that $X'(r)$ decreases
more rapidly with $r$ than $X(r)$. This should be physically so, because
while $X(r)$ refers to the group formed of all individuals who have an
$x$ between $1 - r\Delta$ and $1 - (r-1)\Delta$, $X'(r)$ refers to the group formed
by individuals left within that interval after subtracting the amount
which contributed to the $r - 1$ preceding $n$'s. Hence $X(r) > X'(r)$.

It should be noticed that $X(r)$ decreases with $r$ less rapidly than
$1/r$, but $X'(r)$ decreases more rapidly than $1/r$.

By a similar procedure using (48), (49), and (51), we obtain
from (45) for very small values of $\Delta$:

$$N^{(r)} = A\Delta(1-r\Delta)^{-\nu}\,[1 - a(1-\nu)\Delta]^{r-1}. \tag{54}$$

Introducing (52) and (54) into (23), we find:

$$n(r) = A\Delta\,[1 - a(1-\nu)\Delta]^{r-1}\left\{(1-r\Delta)^{-\nu} + a(1-r\Delta)^{1-\nu}\right\}. \tag{55}$$


The variation of $n(r)$ with $r$ is rather complicated. For small values
of $r$ the term $[1 - a(1-\nu)\Delta]^{r-1}$ decreases more rapidly than
$(1 - r\Delta)^{-\nu}$ increases. Therefore, for small values of $r$, $N^{(r)}$ decreases,
but less rapidly than $X'(r)$. The quantity $n(r)$ decreases also. Since,
however, for $r = 1/\Delta$, the term $(1 - r\Delta)^{-\nu}$ becomes infinite, $n(r)$ has
a minimum for some value of $r = r_m$. Since by definition $r$ is the
rank-order of decreasing sizes, such a situation would be physically absurd.
This difficulty may be avoided by the following consideration.
Equation (55) is based on the approximate expressions (52) and (54),
which cannot be applied for values of $r$ that are close to $1/\Delta$. It
must be remembered that $r$ varies from 1 to $1/\Delta$ only. If the parameters
in equation (55) can be chosen so that the value $r_m$ for which
$n(r)$ has a minimum is greater than $1/\Delta - 1$, then the above difficulty
will be avoided. We shall now prove that this can be done.
Denote

$$\gamma = 1 - a(1-\nu)\Delta; \qquad 0 < \gamma < 1. \tag{56}$$

Equation (55) now becomes

$$n(r) = A\Delta\gamma^{r-1}\left\{(1-r\Delta)^{-\nu} + a(1-r\Delta)^{1-\nu}\right\}. \tag{57}$$
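We have

$$\frac{dn(r)}{dr} = A\Delta\gamma^{r-1}(1-r\Delta)^{-\nu}\left[\nu\Delta(1-r\Delta)^{-1} + \log\gamma - a(1-\nu)\Delta + a(1-r\Delta)\log\gamma\right]. \tag{58}$$

The value $r_m$ is defined by $dn(r)/dr = 0$ or

$$\nu\Delta + (1-r\Delta)\left[\log\gamma - a(1-\nu)\Delta\right] + a(1-r\Delta)^2\log\gamma = 0. \tag{59}$$

Introduce the new variable

$$z = 1 - r\Delta; \qquad z_m = 1 - r_m\Delta. \tag{60}$$

We then have

$$z_m^2\,a\log(1/\gamma) + z_m\left[\log(1/\gamma) + a(1-\nu)\Delta\right] - \nu\Delta = 0. \tag{61}$$

Equation (61) gives

$$z_m = \frac{1}{2a\log(1/\gamma)}\left\{-\left[\log(1/\gamma) + a(1-\nu)\Delta\right] + \sqrt{\left[\log(1/\gamma) + a(1-\nu)\Delta\right]^2 + 4a\nu\Delta\log(1/\gamma)}\right\}. \tag{62}$$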

The positive sign must be taken before the square root because $z_m > 0$.
If we wish to have $r_m > 1/\Delta - 1$, then, because of (60), we must have

$$z_m < \Delta. \tag{63}$$


Inequality (63) will be satisfied if $\gamma$ is made sufficiently small. To
show this, we make use of (56) and write (62) thus:

$$z_m = \frac{1}{2a\log(1/\gamma)}\left\{-\left[\log(1/\gamma) + 1 - \gamma\right] + \sqrt{\left[\log(1/\gamma) + 1 - \gamma\right]^2 + \frac{4(1-\gamma)\nu}{1-\nu}\log(1/\gamma)}\right\}. \tag{64}$$

As $\gamma$ becomes very small, $\log(1/\gamma)$ becomes very large. Thus we may
neglect $1 - \gamma$ in the expression in brackets, as well as neglect $\gamma$ as
compared with 1. We then have:

$$z_m = \frac{1}{2a\log(1/\gamma)}\left\{-\log(1/\gamma) + \log(1/\gamma)\sqrt{1 + \frac{4\nu}{1-\nu}\,\frac{1}{\log(1/\gamma)}}\right\}. \tag{65}$$

$1/\log(1/\gamma)$ being now a very small quantity, we may expand the
expression under the square root sign, keeping only linear terms. We
thus find

$$z_m = \frac{1}{2a\log(1/\gamma)}\,\frac{2\nu}{1-\nu}. \tag{66}$$


By making $\gamma$ sufficiently small, we can always satisfy inequality (63).

But a small $\gamma$ means a sufficiently large $a$. Hence inequality (63)
may be satisfied by taking a sufficiently large $a$, though not large
enough to make $\gamma$ negative.

Thus with a proper choice of $a$, $n(r)$ as given by (55) will be
monotonically decreasing with $r$, within the range $1 \le r \le 1/\Delta - 1$.
We may now investigate under what conditions, if any, $n(r)$ will
vary within a wide range approximately as $1/r$, so as to satisfy
relation (1).
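
The position of the minimum can be examined numerically. The following is a minimal sketch, assuming the small-$\Delta$ expression (57) and the root formula (62); the function names are ours, and the parameter values mentioned afterwards are hypothetical.

```python
import math


def gamma_of(nu, a, delta):
    """gamma = 1 - a(1 - nu)delta, eq. (56); must lie strictly between 0 and 1."""
    gamma = 1 - a * (1 - nu) * delta
    if not 0 < gamma < 1:
        raise ValueError("parameters violate 0 < gamma < 1")
    return gamma


def n_small_delta(r, A, nu, a, delta):
    """Community size n(r) in the small-delta form, eq. (57)."""
    gamma = gamma_of(nu, a, delta)
    z = 1 - r * delta
    return A * delta * gamma ** (r - 1) * (z ** (-nu) + a * z ** (1 - nu))


def z_min(nu, a, delta):
    """Positive root z_m of the quadratic (61), written as in (62)."""
    log_inv = math.log(1 / gamma_of(nu, a, delta))
    b = log_inv + a * (1 - nu) * delta
    disc = b * b + 4 * a * nu * delta * log_inv
    return (-b + math.sqrt(disc)) / (2 * a * log_inv)


def r_min(nu, a, delta):
    """Rank r_m of the minimum of n(r), from z_m = 1 - r_m * delta, eq. (60)."""
    return (1 - z_min(nu, a, delta)) / delta
```

With `nu = 0.5` and `delta = 0.01`, a moderate value such as `a = 20.0` places the minimum inside the range of ranks, whereas a value close to the admissible limit, such as `a = 198.0`, pushes it beyond $1/\Delta - 1$, so that `n_small_delta` then decreases over the whole range.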

The general theory of the breaking up of a social group into
classes, as developed in previous papers (1, 2), is based on the
assumption that only such individuals associate with each other for
whom the difference $(x' - x)^2$ is less than a certain quantity $\Delta_0^2$. We
had as a criterion,

$$(x' - x)^2 < \Delta_0^2. \tag{67}$$

We shall now consider a different criterion which is perhaps somewhat
more realistic. We shall assume, namely, that it is not the difference
$x' - x$, but the ratio $x'/x$, which determines whether or not
two individuals associate with each other. The plausibility of such an
assumption is suggested by the following considerations.

An individual with an income of $100,000 is likely to associate 
with another individual whose income is $75,000, but an individual 
with an income of $26,000 is not likely to associate with an individual 
having an income of $1,000. The difference is the same in both cases, 
but the ratios are different. Similarly, an executive or a politician 
who controls directly or indirectly 10,000 individuals will associate 
with another one who controls 6,000 individuals, but an executive 
having control over 5,000 individuals will not associate with a fore- 
man having control over 25 individuals. 
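
The contrast between the two criteria in the income example can be made concrete. The following is a minimal sketch, assuming a criterion on the squared difference, as in (67), and one on the squared difference of logarithms, which is the form a ratio criterion takes; the function names and the two threshold values used below are hypothetical.

```python
import math


def associate_by_difference(x1, x2, delta0):
    """Difference criterion: (x1 - x2)**2 < delta0**2."""
    return (x1 - x2) ** 2 < delta0 ** 2


def associate_by_ratio(x1, x2, delta1):
    """Ratio criterion: (log x1 - log x2)**2 < delta1**2."""
    return (math.log(x1) - math.log(x2)) ** 2 < delta1 ** 2
```

With a difference threshold of 30,000 both pairs of incomes, (100,000; 75,000) and (26,000; 1,000), are treated alike, since each differs by 25,000. With a ratio threshold of `math.log(2)` only the first pair qualifies, because 26,000 exceeds 1,000 by a factor of 26.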

Instead of (67) we may now put

$$(\log x' - \log x)^2 < \Delta_1^2, \tag{68}$$

and instead of equation (13) of (2), we shall determine $x_1$ from the
equation

$$\int_{x_1}^{1}\!\int_{x_1}^{1}\left[(\log x' - \log x)^2 - \Delta_1^2\right] N(x)\,N(x')\,dx\,dx' = 0. \tag{69}$$

Equations of similar form will determine $x_2$, $x_3$, etc.
We run again into the same difficulty as before, namely, the equations
determining $x_i$ are transcendental. We shall therefore again use
a very rough approximation corresponding to (24) of the previous
case. We shall put, namely,

$$x_i = \beta^i, \qquad 0 < \beta < 1. \tag{70}$$

For $N(x)$ we shall again use (28).
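We find now, with reference to (12) and (13),

$$X(r) = A\int_{\beta^r}^{\beta^{r-1}} x^{1-\nu}\,dx = \frac{A(1-\beta^{2-\nu})}{2-\nu}\,\beta^{(r-1)(2-\nu)}, \tag{71}$$

$$N(r) = A\int_{\beta^r}^{\beta^{r-1}} x^{-\nu}\,dx = \frac{A(1-\beta^{1-\nu})}{1-\nu}\,\beta^{(r-1)(1-\nu)}. \tag{72}$$

Now both $X(r)$ and $N(r)$ decrease monotonically with $r$, so that
equation (14) can be used, giving

$$n(r) = A\left(\frac{1-\beta^{1-\nu}}{1-\nu}\,\beta^{(r-1)(1-\nu)} + a\,\frac{1-\beta^{2-\nu}}{2-\nu}\,\beta^{(r-1)(2-\nu)}\right). \tag{73}$$

It is readily seen that $n(r)$ decreases much more rapidly than $1/r$.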
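
That the decrease is faster than $1/r$ may be seen by tabulating the product $r\,n(r)$, which would be constant under relation (1). The following is a minimal sketch, assuming expressions (71) to (73); the function name and the parameter values quoted below are hypothetical.

```python
def geometric_sizes(A, nu, a, beta, r_max):
    """n(r) of eq. (73) for class boundaries x_i = beta**i, r = 1..r_max."""
    sizes = []
    for r in range(1, r_max + 1):
        # N(r), eq. (72): number of individuals in class r
        n_r = A * (1 - beta ** (1 - nu)) / (1 - nu) * beta ** ((r - 1) * (1 - nu))
        # X(r), eq. (71): total amount of x in class r
        x_r = A * (1 - beta ** (2 - nu)) / (2 - nu) * beta ** ((r - 1) * (2 - nu))
        sizes.append(n_r + a * x_r)  # eq. (14)
    return sizes
```

With `A = 1.0`, `nu = 0.5`, `a = 10.0`, `beta = 0.5`, the product of rank and size falls steadily from the third rank onward, since both terms of (73) are geometric in $r$.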
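We now calculate $X'(r)$, $N^{(r)}(x)$, and $N^{(r)}$. We have

$$X'(1) = A\int_{\beta}^{1} x^{1-\nu}\,dx = \frac{A(1-\beta^{2-\nu})}{2-\nu}; \tag{74}$$

$$\bar{N}^{(1)} = A\int_0^{\beta} x^{-\nu}\,dx = \frac{A\beta^{1-\nu}}{1-\nu}; \tag{75}$$

$$N^{(2)}(x) = \left(1 - \frac{aX'(1)}{\bar{N}^{(1)}}\right) N(x) = A\left(1 - a\,\frac{(1-\nu)(1-\beta^{2-\nu})}{(2-\nu)\beta^{1-\nu}}\right) x^{-\nu}; \tag{76}$$

$$X'(2) = \int_{\beta^2}^{\beta} x N^{(2)}(x)\,dx = \left(1 - a\,\frac{(1-\nu)(1-\beta^{2-\nu})}{(2-\nu)\beta^{1-\nu}}\right)\frac{A(1-\beta^{2-\nu})}{2-\nu}\,\beta^{2-\nu}. \tag{77}$$

Define

$$b = a\,\frac{(1-\nu)(1-\beta^{2-\nu})}{2-\nu}, \tag{78}$$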
and

$$f_1(r) = (1 - b\beta^{\nu-1})(1 - b\beta^{\nu})(1 - b\beta^{\nu+1})\cdots(1 - b\beta^{\nu+r}), \tag{79}$$

with $f_1(-2) = 1$. We shall now prove that in general

$$N^{(r)}(x) = A f_1(r-3)\,x^{-\nu}; \qquad X'(r) = \frac{Ab}{a(1-\nu)}\,f_1(r-3)\,\beta^{(r-1)(2-\nu)}. \tag{80}$$
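We have from (80):

$$\bar{N}^{(r)} = \int_0^{\beta^r} N^{(r)}(x)\,dx = \frac{A}{1-\nu}\,f_1(r-3)\,\beta^{r(1-\nu)}. \tag{81}$$

Hence

$$N^{(r+1)}(x) = \left(1 - \frac{aX'(r)}{\bar{N}^{(r)}}\right) N^{(r)}(x) = (1 - b\beta^{\nu+r-2})\,N^{(r)}(x) = A f_1(r-2)\,x^{-\nu}, \tag{82}$$

and

$$X'(r+1) = \int_{\beta^{r+1}}^{\beta^r} x N^{(r+1)}(x)\,dx = f_1(r-2)\,\frac{A(1-\beta^{2-\nu})}{2-\nu}\,\beta^{r(2-\nu)} = \frac{Ab}{a(1-\nu)}\,f_1(r-2)\,\beta^{r(2-\nu)}. \tag{83}$$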
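Since (80) holds for $r = 2$, it therefore holds for any $r$. We also have

$$N^{(r)} = \int_{\beta^r}^{\beta^{r-1}} N^{(r)}(x)\,dx = \frac{A(1-\beta^{1-\nu})}{1-\nu}\,f_1(r-3)\,\beta^{(r-1)(1-\nu)}. \tag{84}$$

Introducing (80) and (84) into (23), we find

$$n(r) = \frac{A}{1-\nu}\,f_1(r-3)\,\beta^{(r-1)(1-\nu)}\left(1 - \beta^{1-\nu} + b\beta^{r-1}\right). \tag{85}$$

$n(r)$ decreases monotonically with $r$, but more rapidly than $1/r$.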
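
As in the case of equal intervals, the closed form may be compared with a direct step-by-step subtraction. The following is a minimal sketch, assuming the power law (28) and boundaries $x_i = \beta^i$; the function names and the variable `pool` (the number of individuals with $x$ below $\beta^r$) are our own labels, and the parameter values quoted below are hypothetical.

```python
def depletion_geometric(A, nu, a, beta, r_max):
    """Community sizes n(1..r_max) by successive subtraction, eq. (23)."""
    scale = 1.0  # N^(r)(x) = scale * A * x**(-nu)
    sizes = []
    for r in range(1, r_max + 1):
        hi, lo = beta ** (r - 1), beta ** r
        nucleus = scale * A / (1 - nu) * (hi ** (1 - nu) - lo ** (1 - nu))
        x_total = scale * A / (2 - nu) * (hi ** (2 - nu) - lo ** (2 - nu))
        pool = scale * A / (1 - nu) * lo ** (1 - nu)
        sizes.append(nucleus + a * x_total)
        scale *= 1 - a * x_total / pool  # fraction lost by every remaining class
    return sizes


def n_closed_geometric(r, A, nu, a, beta):
    """Closed form of n(r), eq. (85), with b from eq. (78)."""
    b = a * (1 - nu) * (1 - beta ** (2 - nu)) / (2 - nu)
    f1 = 1.0
    for k in range(1, r):  # r - 1 factors, exponents nu - 1, ..., nu + r - 3
        f1 *= 1 - b * beta ** (nu + k - 2)
    return (A / (1 - nu) * f1 * beta ** ((r - 1) * (1 - nu))
            * (1 - beta ** (1 - nu) + b * beta ** (r - 1)))
```

With `A = 1.0`, `nu = 0.5`, `a = 2.0`, `beta = 0.5`, the two computations agree to rounding error. The value of `a` must be small enough that the first factor, $1 - b\beta^{\nu-1}$, remains positive.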
We may consider a more general assumption, namely that $\beta$
varies from class to class. Denoting by $\beta_1, \beta_2, \ldots, \beta_i$ a sequence of
numbers such that

$$0 < \beta_i < 1, \tag{86}$$

we may put

$$x_i = \beta_1\beta_2\cdots\beta_i. \tag{87}$$

In that case

$$X(r) = A\int_{\beta_1\beta_2\cdots\beta_r}^{\beta_1\beta_2\cdots\beta_{r-1}} x^{1-\nu}\,dx = \frac{A(1-\beta_r^{2-\nu})}{2-\nu}\,(\beta_1\cdots\beta_{r-1})^{2-\nu}. \tag{88}$$

$X(r)$ decreases with $r$, and by a proper choice of the sequence
$\beta_1, \beta_2, \ldots, \beta_i$ it may be made to decrease as $1/r$. Equation (15) would
then satisfy (1). A similar assumption may be studied for the more
general case involving $N^{(r)}$, $X'(r)$, and equation (23). This shall be
done elsewhere.
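
One way of constructing such a sequence follows directly from (88), since $X(r)$ is proportional to $x_{r-1}^{2-\nu} - x_r^{2-\nu}$. The following is a minimal sketch of that construction, which is our own illustration and not a procedure given in the text; the constant `c` and the function names are hypothetical, and the construction works only for a finite number of classes, because the quantities $x_r^{2-\nu}$ must remain positive.

```python
def betas_for_inverse_rank(nu, c, r_max):
    """beta_1..beta_rmax chosen so that x_{r-1}**(2-nu) - x_r**(2-nu) = c / r."""
    betas = []
    u_prev = 1.0  # u_r = x_r**(2 - nu), starting from x_0 = 1
    for r in range(1, r_max + 1):
        u = u_prev - c / r
        if u <= 0:
            raise ValueError("c is too large for r_max classes")
        betas.append((u / u_prev) ** (1 / (2 - nu)))
        u_prev = u
    return betas


def X_from_betas(betas, A, nu):
    """X(1), X(2), ... computed from eq. (88)."""
    values = []
    prod = 1.0  # beta_1 * ... * beta_{r-1}
    for beta_r in betas:
        values.append(A * (1 - beta_r ** (2 - nu)) / (2 - nu) * prod ** (2 - nu))
        prod *= beta_r
    return values
```

With `nu = 0.5`, `c = 0.1`, and twenty classes, every `beta` lies between 0 and 1 and the product of rank and `X(r)` is the same for all ranks, so that (15) reproduces relation (1) over that range.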
The author is indebted to Professor Alston S. Householder for a
critical discussion of this paper.
A REVIEW
“The behavior selected for study was the familiar and everyday practice of
parking automobiles in city streets. It was selected . . . because it is a form of
behavior that occurs frequently, can be observed easily, be measured in terms of
frequency and duration of occurrence, can be modified by posting signs indicating
whether or not parking is legally allowed and if so for how long, and above
all it can be observed without the slightest interference with it. By securing the
cooperation of the local police department, the authors were able to compare parking
behavior under three main conditions: (1) when parking was unrestricted,
(2) restricted or prohibited but ‘no tagging’ for over-parking, (3) restricted and
enforced by ‘tagging.’ ”

Findings are interpreted in terms of “learning theory,” which “conceives of 
the overt behavior of an individual, acting either alone or in a group of other 
individuals, as behavior which he has learned or is learning to perform. His be- 
havior is determined by the relation between four factors—drive, cue, response, 
and reward—which relation he has learned, or is learning.” 

The authors show “that changes in the frequency and duration of parking 
behavior of an unselected sample of a population can be predicted by the use of
their empirical formulas. The question that remains unanswered is whether or 
not these empirical formulas can be derived deductively from any set of basic 
postulates.” 

STEUART HENDERSON BRITT 
Washington, D. C. 

